When formatting, replace inline ASCII quotes with typographic quotes #701
gibson042 wants to merge 5 commits
In the editor call today, I asked to confirm that this auto-formatting would also have made all of the changes from tc39/ecma262#3861 for us automatically.
Also, do we have tests for the formatter that we can add to? In particular, I think the negative cases will be valuable.
Yes: https://github.com/tc39/ecmarkup/blob/39c401c8ba885942c142cfff513a9d5476d83640/test/formatter.ts (Including negative tests in the form of
Then let's also add tests for this @gibson042.
This was a great idea, and did in fact find some bugs (now fixed).
Done.
it('preserves ASCII quotes that are code', async () => {
  await assertRoundTrips(
    `<p>ASCII quotes are not replaced in <code>"code elements"</code>, <emu-val>"emu-val elements"</emu-val>, ${'`'}"backtick spans"${'`'}, or *"inline language strings"*.</p>\n`,
Also put a quoted HTML attribute in this test.
});
- it('<script>, <style>, and <pre> whitespace is preserved', async () => {
+ it('<script>, <style>, and <pre> whitespace and inner ASCII quotes are preserved', async () => {
- it('<script>, <style>, and <pre> whitespace and inner ASCII quotes are preserved', async () => {
+ it('<script>, <style>, <pre> whitespace, and inner ASCII quotes are preserved', async () => {
Ref tc39/ecma262#3861 (review)
Ref #173
Ref #317
Note that this processing cannot be scoped to individual text nodes because logical quotations can span across those, e.g. `a "<a href="…">binary64</a> value"` should get rewritten into `a “<a href="…">binary64</a> value”` (but the same would not be true if the medial element were block-level rather than inline).
ASCII quotes are not replaced inside of HTML comments, `<code>`/`<emu-val>` elements, backtick spans (e.g., `` `code` ``), asterisk spans (e.g., `*"string"*`), or after equals signs (as in HTML element attributes).
This approach is not perfect, but spec source text tends to strongly avoid the sort of edge cases that would reveal its flaws (e.g., using it as an automated analog of tc39/ecma262#3861 finds only one line to change, in substring: from «If the "to" suffix is omitted» to «If the “to” suffix is omitted»).