Background and motivation
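It's quite common to want to search for whitespace (char.IsWhiteSpace) or for anything other than whitespace (!char.IsWhiteSpace). This is true not only in regex, via \s and \S (according to our NuGet regex database there are ~13,000 occurrences of a regex that's simply \s), but also in many open-coded loops, e.g.
- runtime/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.Globalization.cs, lines 17 to 25 in 264d739
- runtime/src/libraries/System.Linq.Expressions/src/System/Linq/Expressions/DebugViewWriter.cs, lines 1194 to 1204 in 264d739
- runtime/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.Trim.cs, lines 568 to 589 in 264d739
Etc. We should expose these as dedicated helpers, whether or not we're able to improve performance over a simple loop (we might be able to, for at least some kinds of input).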
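For illustration, an open-coded loop of the kind described here typically looks something like the sketch below. It is not copied from any particular file; the class and method names are hypothetical and exist only for this example.

```csharp
using System;

// Hypothetical helper, for illustration only; not taken from the runtime sources.
public static class OpenCodedWhiteSpaceLoops
{
    // Returns the index of the first non-whitespace char, or -1 if the span
    // is empty or consists solely of whitespace.
    public static int SkipLeadingWhiteSpace(ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (!char.IsWhiteSpace(span[i]))
            {
                return i;
            }
        }

        return -1;
    }
}
```

Every call site that wants this behavior ends up repeating the same loop, which is what a dedicated helper would replace.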
API Proposal
namespace System;
public static class MemoryExtensions
{
+ public static int IndexOfAnyWhiteSpace(this ReadOnlySpan<char> span);
+ public static int IndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span);
+ public static int LastIndexOfAnyWhiteSpace(this ReadOnlySpan<char> span);
+ public static int LastIndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span);
}
- This is only proposed for ReadOnlySpan<char> and not also Span<char>, since the most common case by far is expected to be spans derived from strings. The existing MemoryExtensions.IsWhiteSpace is also only exposed for ReadOnlySpan<char>.
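To make the intended semantics concrete, here is a minimal reference sketch of the four methods as plain loops. It assumes the usual IndexOf convention of returning -1 when nothing matches (consistent with the `< 0` check in the usage example). The class name WhiteSpaceSearchSketch is hypothetical, and this is a stand-in for discussion, not the implementation that would ship.

```csharp
using System;

// Hypothetical stand-in for the proposed MemoryExtensions methods.
// Assumption: each method returns -1 when no matching char is found.
public static class WhiteSpaceSearchSketch
{
    // Index of the first char for which char.IsWhiteSpace is true.
    public static int IndexOfAnyWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Index of the first char for which char.IsWhiteSpace is false.
    public static int IndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (!char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Index of the last char for which char.IsWhiteSpace is true.
    public static int LastIndexOfAnyWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = span.Length - 1; i >= 0; i--)
        {
            if (char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Index of the last char for which char.IsWhiteSpace is false.
    public static int LastIndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = span.Length - 1; i >= 0; i--)
        {
            if (!char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }
}
```

Any faster implementation would need to return the same results as these loops for every input.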
API Usage
For example, MemoryExtensions.IsWhiteSpace could be rewritten as simply:
public static bool IsWhiteSpace(this ReadOnlySpan<char> span) => span.IndexOfAnyExceptWhiteSpace() < 0;
Alternative Designs
If we want to expose these but don't want them to be so prominent, then once #68328 is implemented (assuming it sticks with the proposed design), the same functionality could instead be exposed as a static property on IndexOfAnyValues:
public static class IndexOfAnyValues
{
+ public static IndexOfAnyValues<char> WhiteSpace { get; }
}
in which case the same functionality could be achieved with:
int wsIndex = span.IndexOfAny(IndexOfAnyValues.WhiteSpace); // or IndexOfAnyExcept
The WhiteSpace property would cache a specialized concrete implementation that does what the proposed IndexOfAnyWhiteSpace would do.
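As a rough approximation of this alternative, the sketch below builds a cached whitespace set on top of System.Buffers.SearchValues<char>, which is the name the #68328 type has in .NET 8 and later. The holder class WhiteSpaceValuesSketch and its members are hypothetical. Unlike the proposed property, this uses whatever general-purpose implementation SearchValues.Create selects rather than a specialized one.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;

// Hypothetical holder approximating the proposed WhiteSpace property.
// Assumption: targets .NET 8 or later, where SearchValues<T> is available.
public static class WhiteSpaceValuesSketch
{
    // Built once and cached for the lifetime of the process.
    public static SearchValues<char> WhiteSpace { get; } = CreateWhiteSpaceValues();

    private static SearchValues<char> CreateWhiteSpaceValues()
    {
        // Collect every UTF-16 code unit that char.IsWhiteSpace accepts.
        // An int loop variable avoids overflow at char.MaxValue.
        var chars = new List<char>();
        for (int c = char.MinValue; c <= char.MaxValue; c++)
        {
            if (char.IsWhiteSpace((char)c))
            {
                chars.Add((char)c);
            }
        }

        return SearchValues.Create(chars.ToArray());
    }

    // Equivalent of the proposed IndexOfAnyWhiteSpace.
    public static int FirstWhiteSpace(ReadOnlySpan<char> span) =>
        span.IndexOfAny(WhiteSpace);

    // Equivalent of the proposed IndexOfAnyExceptWhiteSpace.
    public static int FirstNonWhiteSpace(ReadOnlySpan<char> span) =>
        span.IndexOfAnyExcept(WhiteSpace);
}
```

The trade-off is discoverability: callers have to know the cached set exists, whereas dedicated methods show up directly on the span.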
Risks
No response