fix(paste): detect heading levels from Google Docs styled paragraphs#2178
fix(paste): detect heading levels from Google Docs styled paragraphs#2178ErickPetru wants to merge 2 commits intosuperdoc-dev:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 167f5aa820
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
packages/super-editor/src/core/inputRules/google-docs-paste/google-docs-paste.js
Outdated
Show resolved
Hide resolved
packages/super-editor/src/core/inputRules/google-docs-paste/google-docs-paste.js
Outdated
Show resolved
Hide resolved
caio-pizzol
left a comment
There was a problem hiding this comment.
nice work on this, thanks for picking it up! the code is clean, tests are thorough, and the prior bot feedback (bold false positives + list-item corruption) is both addressed in the current version.
two things worth discussing:
multi-span headings won't be detected
getHeadingStyleProps only reads styles from a single child <span> (children.length === 1). a heading that contains a link, bookmark, or line break will produce multiple child elements — and since Google Docs usually puts the styles on the spans (not the <p>), those headings will be silently skipped.
e.g. this wouldn't be detected:
<p>
<span style="font-size:20pt;font-weight:700">Heading with </span>
<a href="..."><span style="font-size:20pt;font-weight:700">a link</span></a>
</p>one option: check that all child elements share the same font-size and bold status instead of requiring exactly one span. up to you whether that's worth the added complexity for this PR or a follow-up.
minor: closest('li') instead of parentElement
the list-item filter only checks the immediate parent. the rest of the pipeline makes the same assumption (<li> > <p> direct child), so it's consistent. but !p.closest('li') would be a zero-cost safety net in case Google Docs ever wraps list content differently.
Problem
Closes #2152
Cause
Google Docs converts most heading levels to
<p>tags with inlinefont-size/font-weightstyling instead of semantic<h1>–<h6>tags. ProseMirror'sparagraph.parseDOMhas rules forh1–h6but they never fire for these styled<p>elements, so the heading level is silently dropped.Fix
convertStyledHeadingsto the Google Docs paste pipeline, running before ProseMirror parses the pasted DOM<p>elements carrying bold + a recognised font-size (checked on both the<p>itself and its first child<span>, covering both Google Docs style placements) and replaces them with the corresponding<h1>–<h5>elementh1)