Skip to content

[CLEANUP] Clean extra whitespace in css selector#1398

Merged
JakeQZ merged 9 commits into
MyIntervals:mainfrom
8ctopus:cleanup-selector
Dec 8, 2025
Merged

[CLEANUP] Clean extra whitespace in css selector#1398
JakeQZ merged 9 commits into
MyIntervals:mainfrom
8ctopus:cleanup-selector

Conversation

@8ctopus

@8ctopus 8ctopus commented Nov 28, 2025

Copy link
Copy Markdown
Contributor

This PR cleans the css selector from extra whitespace:

$input = <<<CSS
p    >
    small {
font-size:.7em
}

CSS;

$document = (new \Sabberworm\CSS\Parser($input))
    ->parse();

echo $document->render(\Sabberworm\CSS\OutputFormat::createPretty());

without the PR

p    >
    small {
        font-size: .7em;
}

with the PR:

p > small {
        font-size: .7em;
}

@coveralls

coveralls commented Nov 28, 2025

Copy link
Copy Markdown

Coverage Status

coverage: 62.745% (+0.04%) from 62.706%
when pulling 22bee0f on 8ctopus:cleanup-selector
into e7dcce8 on MyIntervals:main.

@JakeQZ JakeQZ left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It's a nice idea to tidy up the whitespace in the selectors.

However, it's not as simple as it seems. We would also need to avoid touching the whitespace within attribute selectors. Although an unlikely real-world example, this change would break a selector like a[title="extra space"].

So I think this would require a more complex regex that handles quoted strings (single and double quotes) to avoid touching them. I'm not sure how a[title=extra space] is supposed to be interpreted; that would need checking up on.

If you're willing to give this a go, we would welcome the tidying functionality, with some tests added to SelectorTest to make sure it reformats only the whitespace it should. I also understand if this now seems like too big a job - instead we could convert this PR to a 'would be nice to have' feature request.

@JakeQZ

JakeQZ commented Nov 29, 2025

Copy link
Copy Markdown
Collaborator

If you're willing to give this a go

DeclarationBlock::parse() may be the place to look for removing whitespace, though I'm not sure how.

I also note that setSelectors would fail for a[title="this, and that"], though the parser would process it fine.

@8ctopus, whether or not you proceed with this PR, I thank you for helping us identify some gremlins :/

@8ctopus

8ctopus commented Dec 2, 2025

Copy link
Copy Markdown
Contributor Author

@JakeQZ Give me a few days, I'll see what I can come up with as a solution.

@JakeQZ

JakeQZ commented Dec 2, 2025

Copy link
Copy Markdown
Collaborator

@JakeQZ Give me a few days, I'll see what I can come up with as a solution.

Thanks. There are some comments in #1324 that may be useful.

@8ctopus

8ctopus commented Dec 5, 2025

Copy link
Copy Markdown
Contributor Author

I've come up with a solution for you to look at when you have time. Unfortunately, not elegant, but all tests do pass. I couldn't figure out how to create a regex that only takes two or more spaces that are not within an attribute selector (negative look behind only supports the characters immediately beforehand, therefore not possible to use it).

Comment thread tests/Unit/Property/SelectorTest.php Outdated
*/
public function cleanupSpacesWithinSelector(): void
{
$selector = <<<'CSS'

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use heredoc - indented heredoc doesn't work with with PHP 7.2 (though it works with 7.3 upwards), and it makes autoformatting the code more of a hassle.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, could you please switch to a literal, quotes-enclosed string?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for bringing this to my attention, it's fixed.

@JakeQZ JakeQZ left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've come up with a solution for you to look at when you have time. Unfortunately, not elegant, but all tests do pass. I couldn't figure out how to create a regex that only takes two or more spaces that are not within an attribute selector (negative look behind only supports the characters immediately beforehand, therefore not possible to use it).

Yeah, completely solving this with regex may not be straightfoward, nor the long-term way forward.

We have a couple of open issues in this area: #1324, #1325.

What you've done goes some of the way, and the tests you've added will help ensure tightness when we refactor the code.

I've suggested also replacing single 'space' characters (such as newlines or tabs) with actual spaces. Would you be willing to create tests for those cases? And resolve the issues raised by @oliverklee?

Thanks for taking the time to contribute; much appreciated <3.

@JakeQZ JakeQZ left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I failed to save the inline code comment. Here it is.

Comment thread src/Property/Selector.php Outdated
$this->selector = \trim($selector);
$selector = \trim($selector);

$this->selector = !str_contains($selector, '[') ? preg_replace('/\s{2,}/', ' ', $selector) : $selector;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The \s needs double-escaping (it does technically work single-escaped, but fails a coding style check).

We could also convert any whitespace to an actual space - e.g. a single newline, tab, etc. - and have a test for that too.

And for better performance (and to avoid catastrophic backtracking in the unlikely case of a selector string with a million spaces being passed), we can use the possessive quantifier (+).

Suggested change
$this->selector = !str_contains($selector, '[') ? preg_replace('/\s{2,}/', ' ', $selector) : $selector;
$this->selector = !str_contains($selector, '[') ? preg_replace('/\\s++/', ' ', $selector) : $selector;

@JakeQZ JakeQZ left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've suggested a couple of improvements to the test method names, so that they more accurately describe what the class does (or is expected to do).

Comment thread tests/Unit/Property/SelectorTest.php Outdated
/**
* @test
*/
public function cleanupSpacesWithinSelector(): void

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public function cleanupSpacesWithinSelector(): void
public function cleansUpSpacesWithinSelector(): void

Comment thread tests/Unit/Property/SelectorTest.php Outdated
/**
* @test
*/
public function doNotCleanupSpacesWithinAttributeSelector(): void

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public function doNotCleanupSpacesWithinAttributeSelector(): void
public function doesNotCleanupSpacesWithinAttributeSelector(): void

@8ctopus

8ctopus commented Dec 6, 2025

Copy link
Copy Markdown
Contributor Author

Thank you again for the valuable input, I've made all changes requested, please review when you have the time.

@JakeQZ JakeQZ left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again for the valuable input, I've made all changes requested, please review when you have the time.

Thanks. This is looking good now. I like the tests.

There are a couple of further issues to tidy/fix (see specific comments).

Also, could you add an entry to the changelog, probably in the 'Changed' section. (New entries should go at the top of the list.)

Comment thread src/Property/Selector.php Outdated
$this->selector = \trim($selector);
$selector = \trim($selector);

$this->selector = !str_contains($selector, '[') ? preg_replace('/\\s++/', ' ', $selector) : $selector;

@JakeQZ JakeQZ Dec 6, 2025

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str_contains requires PHP 8, whereas the library supports PHP from 7.2, so I think we'll need to use strpos instead (for now). Also, we're using \ prefix when calling a global function (for clarity).

EDIT: I just realized I originally got the logic the wrong way round in the suggestion. So I think it would be good to use a local variable to make it clear what's happening. Thinking more, a comment explaining why whitespace isn't removed when there's an attribute selector wouldn't go amiss.

Suggested change
$this->selector = !str_contains($selector, '[') ? preg_replace('/\\s++/', ' ', $selector) : $selector;
// Whitespace can't be adjusted within an attribute selector, as it would change its meaning
$hasAttribute = \strpos($selector, '[') !== false;
$this->selector = $hasAttribute ? $selector : preg_replace('/\\s++/', ' ', $selector);

Comment thread tests/Unit/Property/SelectorTest.php Outdated
Comment on lines +174 to +176
$selector = "p
>
small";

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To improve readability of the code, it would be better to use \n in the string rather than an actual newline (so that we don't have contents of the string indented less than the main code).

Suggested change
$selector = "p
>
small";
$selector = "p\n>\nsmall";

@8ctopus

8ctopus commented Dec 8, 2025

Copy link
Copy Markdown
Contributor Author

All changes done, thank you for the feedback.

@oliverklee

Copy link
Copy Markdown
Collaborator

@8ctopus Thanks! Now there's a merge conflict in the changelog file. Could you please rebase on main (rebase, please don't merge main into this branch), resolve the merge conflict and repush? Thanks!

@8ctopus

8ctopus commented Dec 8, 2025

Copy link
Copy Markdown
Contributor Author

done

@JakeQZ JakeQZ left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now. Thanks for contributing <3.

@JakeQZ JakeQZ merged commit 49bf66f into MyIntervals:main Dec 8, 2025
23 checks passed
JakeQZ added a commit that referenced this pull request Dec 9, 2025
This now does the selector-parsing that `parse()` used to do itself.

Should help with #1324 and
#1398 (comment)
oliverklee pushed a commit that referenced this pull request Dec 9, 2025
This now does the selector-parsing that `parse()` used to do itself.

Should help with #1324 and
#1398 (comment)
@8ctopus 8ctopus deleted the cleanup-selector branch December 9, 2025 09:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants