Add PartialEq and Eq impls to the character case iterators #154287

Closed
Jules-Bertholet wants to merge 2 commits into rust-lang:main from Jules-Bertholet:partialeq-tocharcase
Conversation

@Jules-Bertholet

Contributor

Allows easily checking whether a string is in a particular case.

Adds impl PartialEq<{ToUppercase, ToLowercase, ToTitlecase, char}> for {ToUppercase, ToLowercase, ToTitlecase} and impl Eq for {ToUppercase, ToLowercase, ToTitlecase}.
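For comparison, here is a minimal sketch of the same check on stable Rust, using `Iterator::eq` against a one-item iterator. The helper names `is_in_uppercase` and `is_in_lowercase` are hypothetical and not part of this PR; with the proposed `PartialEq<char>` impl, the closure body could presumably shrink to `c.to_uppercase() == c`.

```rust
use std::iter::once;

/// Hypothetical helper: true if every char is its own uppercase mapping.
/// Caseless chars (digits, spaces, punctuation) map to themselves and pass.
fn is_in_uppercase(s: &str) -> bool {
    s.chars().all(|c| c.to_uppercase().eq(once(c)))
}

/// Hypothetical helper: the same check against the lowercase mapping.
fn is_in_lowercase(s: &str) -> bool {
    s.chars().all(|c| c.to_lowercase().eq(once(c)))
}
```

Under these definitions, `is_in_uppercase("HELLO WORLD 123")` holds, while `is_in_uppercase("straße")` does not, because `'ß'` uppercases to the two-character sequence `SS`.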

@rustbot label needs-fcp T-libs-api A-unicode

Allows easily checking whether a string is in a particular case.
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 24, 2026
@rustbot

rustbot commented Mar 24, 2026

Collaborator

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @scottmcm, libs
  • @scottmcm, libs expanded to 8 candidates
  • Random selection from Mark-Simulacrum, scottmcm

@rustbot rustbot added A-Unicode Area: Unicode needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 24, 2026
And also add a test for its claims.
@Mark-Simulacrum

Member

> Allows easily checking whether a string is in a particular case.

Isn't this done with .is_ methods on str? It feels a little confusing to compare an iterator (which might be halfway through the result or empty) with &str. Do we have precedent for Iterator / slice comparisons anywhere?
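The "halfway through" concern can be illustrated on stable Rust: the case-mapping iterators are stateful, so what they yield depends on how much has already been consumed. The sketch below uses a hypothetical helper, `remaining_uppercase`, that exists only for illustration.

```rust
/// Hypothetical helper: returns what is left of `c`'s uppercase mapping
/// after `consumed` items have already been taken from the iterator.
fn remaining_uppercase(c: char, consumed: usize) -> String {
    let mut it = c.to_uppercase();
    for _ in 0..consumed {
        it.next(); // advance the iterator, discarding the item
    }
    it.collect()
}
```

For `'ß'`, whose uppercase mapping is `SS`, this gives `"SS"`, `"S"`, and `""` for `consumed` values of 0, 1, and 2. An equality impl on the iterator would therefore depend on the iterator's position as well as on the original character.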

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-t-libs-api Status: Awaiting decision from T-libs-api and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 27, 2026
@Jules-Bertholet

Jules-Bertholet commented Mar 27, 2026

Contributor Author

> Isn't this done with .is_ methods on str?

Those methods do not currently exist. It might make sense to add e.g. .is_in_uppercase() and .is_in_lowercase() to str. However, titlecase is determined word-by-word, so it would have to be e.g. .is_word_in_titlecase() instead. By exposing the more primitive operation, this PR avoids that complexity, while also being more flexible for users.

> Do we have precedent for Iterator / slice comparisons anywhere?

Not that I know of. However, ToUppercase/ToLowercase/ToTitlecase are already more than just iterators, as they implement Display.
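As a small illustration of the `Display` point, the iterator can already be formatted directly without collecting it first. The helper name `shout` is hypothetical.

```rust
/// Hypothetical helper: formats the uppercase mapping of `c` directly,
/// relying on the `Display` impl of `ToUppercase` rather than `collect()`.
fn shout(c: char) -> String {
    format!("{}!", c.to_uppercase())
}
```

Here `shout('a')` yields `"A!"` and `shout('ß')` yields `"SS!"`.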

@Mark-Simulacrum

Member

Hm, okay. I think that makes some sense. Let's see what libs-api thinks (should be in their meeting queue).

@Jules-Bertholet

Contributor Author

An additional minor benefit of the proposed API is that it would allow removing .to_string() from many doc-comment examples in char/methods.rs; making them more concise will hopefully make them more clear as well. (Will do in a follow-up if this is accepted)

@Jules-Bertholet

Jules-Bertholet commented Mar 31, 2026

Contributor Author

It seems T-libs-api wasn't a fan of this, and had some alternative suggestions (https://hackmd.io/3iG4g-ofShaMgcP_OYHCEw#waiting-on-team-rusttf154287-Add-PartialEq-and-Eq-impls-to-the-character-case-iterators). I'll go through them one by one:

> Reject this PR as written and open to:
>
>   • is_lowercase/is_uppercase for str

I would be open to adding those. However, as I explained above, they don't extend neatly to titlecase. Also, they don't work for people who can't easily use str, e.g. because they are working in UTF16, or using a fancy data structure like ropes to store their strings. Having an API for this on char is more flexible, and allows these users to easily make use of core's data tables.

It doesn't necessarily have to be this PR's API, however. For example, we could have methods on char instead. I think the PartialEq approach is nice because it corresponds to Unicode's definitions, but that's not hugely important.
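To make the UTF-16 case concrete, here is a sketch on stable Rust of a char-level check that never builds a `str`. The function name `utf16_is_in_uppercase` is hypothetical, and treating unpaired surrogates as "not uppercase" is an assumption made for this example.

```rust
use std::iter::once;

/// Hypothetical helper: checks UTF-16 code units for "all uppercase"
/// by decoding to `char` on the fly, with no intermediate `String`.
fn utf16_is_in_uppercase(units: &[u16]) -> bool {
    char::decode_utf16(units.iter().copied())
        // Assumption: an unpaired surrogate makes the whole check fail.
        .all(|decoded| decoded.is_ok_and(|c| c.to_uppercase().eq(once(c))))
}
```

The same pattern would apply to a rope or any other structure that can yield `char`s, which is the flexibility argued for above.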

>   • potentially adding eq_ignore_case methods (if they don't add too much Unicode overhead and they're compiled out for people not using them)

Overhead should not be a problem. Case folding is primarily based on the lowercase mapping, so we would only need a small table for the difference. However, there are other issues:

  • Case-folding equality can't be checked character-by-character. "ßs" and "sß" case-fold to the same string ("sss"), but a naïve char-by-char equality check will consider them unequal. A .to_case_folded() method on char would be useful, but the resulting iterator perhaps should not have PartialEq<Self>, or at least should warn prominently in the docs about the footgun.
  • If you are comparing strings ignoring case, you probably also want to consider NF(K)C-equivalent strings to be equal. Adding NF(K)C normalization to the standard library would be a substantially more involved endeavor. See the Unicode spec for more info.

>   • ask about the eq method on iterator

It doesn't optimize well, unfortunately.
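The "ßs" versus "sß" footgun mentioned above can be reproduced with a rough sketch. It approximates case folding as the lowercase of the uppercase mapping, which is an assumption that matches Unicode case folding for these characters but is not a full implementation. All three function names are hypothetical.

```rust
/// Approximation of case folding for one char: lowercase of its uppercase.
/// This is an assumption for the example, not the full Unicode algorithm.
fn approx_fold(c: char) -> impl Iterator<Item = char> {
    c.to_uppercase().flat_map(char::to_lowercase)
}

/// Naive comparison: pairs chars up one by one and compares their folds.
fn naive_eq(a: &str, b: &str) -> bool {
    a.chars().count() == b.chars().count()
        && a.chars()
            .zip(b.chars())
            .all(|(x, y)| approx_fold(x).eq(approx_fold(y)))
}

/// Stream comparison: folds each string fully, then compares the results.
fn folded_eq(a: &str, b: &str) -> bool {
    a.chars()
        .flat_map(approx_fold)
        .eq(b.chars().flat_map(approx_fold))
}
```

With these definitions, `naive_eq("ßs", "sß")` is false because the first pair compares `ss` with `s`, while `folded_eq("ßs", "sß")` is true because both sides flatten to `sss`.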

@nia-e

nia-e commented Apr 2, 2026

Member

We discussed this PR in this week's @rust-lang/libs-api meeting and decided against this proposal. We would, however, be open to addressing this use case through any or all of adding is_uppercase/is_lowercase methods on str, or adding eq_ignore_case if the implementation isn't too gnarly. Alternatively, Iterators already have an eq method - that could be useful here. itertools has an assert_eq for iterators that uses it, for instance. Thanks!
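A minimal sketch of the `Iterator::eq` route on stable Rust follows. The helper name `uppercase_is` is hypothetical. As noted earlier in the thread, this form reportedly does not optimize well, so it is shown for clarity rather than performance.

```rust
/// Hypothetical helper: compares the uppercase mapping of `c` with an
/// expected string by walking both iterators with `Iterator::eq`.
fn uppercase_is(c: char, expected: &str) -> bool {
    c.to_uppercase().eq(expected.chars())
}
```

For example, `uppercase_is('ß', "SS")` holds, while `uppercase_is('ß', "S")` does not, since `Iterator::eq` also requires both sides to have the same length.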

@nia-e nia-e closed this Apr 2, 2026
jhpratt added a commit to jhpratt/rust that referenced this pull request Jun 2, 2026
…imulacrum

Add APIs for case folding to the standard library

[Libs-api requested these](rust-lang#154287 (comment)), so here they are.

New public API (gated behind `#[feature(casefold)]`):

```rust
impl char {
    pub fn to_casefold(self) -> ToCasefold;
}

impl str {
    pub fn to_casefold(&self) -> String;
    pub fn eq_ignore_case(&self, other: &str) -> bool;
}

pub struct ToCasefold { ... }
impl Iterator for ToCasefold { type Item = char; ... }
impl DoubleEndedIterator for ToCasefold { ... }
impl FusedIterator for ToCasefold { }
impl ExactSizeIterator for ToCasefold { ... }
impl fmt::Display for ToCasefold { ... }
```

## Notes

- This only adds a negligible amount of static data to `core::unicode`. To accomplish that, we compute the case-folding for most characters as the lowercase of their uppercase; this double mapping adds some complexity to the implementation.
- No normalization (e.g. NFC) is performed, so visually and semantically equivalent strings can compare unequal.
- I have not put any effort into optimizing `eq_ignore_case()`; there may be a more performant implementation.
- `char::eq_ignore_case()` is left to future work—it's a potential footgun, so we may want to think more deeply about how to expose and document that API.
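The double mapping described in the first note can be sketched on stable Rust. This is only an illustration of the "lowercase of the uppercase" idea, not the implementation in that commit, and the helper name `lower_of_upper` is hypothetical.

```rust
/// Hypothetical helper: uppercases `c`, then lowercases every resulting
/// char. For most characters this matches the Unicode case folding.
fn lower_of_upper(c: char) -> String {
    c.to_uppercase().flat_map(char::to_lowercase).collect()
}
```

This differs from a plain lowercase mapping for some characters: `'ß'` lowercases to itself, but `lower_of_upper('ß')` gives `"ss"`, and the long s `'ſ'` gives `"s"`. The normalization caveat is independent of this: `"\u{e9}"` and `"e\u{301}"` both render as é but compare unequal as strings, and case folding alone does not change that.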

@rustbot label T-libs-api A-unicode
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jun 6, 2026
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jun 6, 2026
jhpratt added a commit to jhpratt/rust that referenced this pull request Jun 7, 2026
jhpratt added a commit to jhpratt/rust that referenced this pull request Jun 7, 2026
jhpratt added a commit to jhpratt/rust that referenced this pull request Jun 7, 2026
pull Bot pushed a commit to asukaminato0721/rust-analyzer that referenced this pull request Jun 7, 2026
pull Bot pushed a commit to xtqqczze/rust-lang-miri that referenced this pull request Jun 8, 2026
Labels

A-Unicode Area: Unicode needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. S-waiting-on-t-libs-api Status: Awaiting decision from T-libs-api T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

4 participants