Repro
Test doc: https://docs.google.com/document/d/1YKyqqH8wZa3kSnoBEdlwAumI94gRivSsZB1qvc9y4CA (title: sample3)
The doc contains a simple 6-row markdown table with a bold header row:
| **Screen Reader** | **Responses** | **Share** |
Data rows (JAWS / NVDA / etc.) are plain text.
Edit applied to the pulled markdown (JAWS row only, no bold added):
- Before:
| JAWS | 853 | 49% |
- After:
| JAWS | 900 | 52% |
Push, then re-pull.
Expected
Only the two edited cells (853 -> 900, 49% -> 52%) update. The JAWS cell is unchanged, and the whole row stays plain text.
Actual
After re-pull, the entire JAWS row is bold:
| **JAWS** | **900** | **52%** |
The unchanged JAWS cell has become bold too. This breaks the table's visual layout and silently mutates content that wasn't edited.
Investigation / root cause mechanism
The reconciler does have an in-place cell-content diff path — _diff_table in extradoc/src/extradoc/diffmerge/diff.py:1131 calls _diff_tables_structural (row/col inserts/deletes) and then iterates get_matched_rows to emit per-cell UpdateBodyContentOps for matched rows. So in principle, edits inside a cell should become deleteContentRange + insertText against the existing cell, not a row replacement.
The bug is in the fuzzy row matcher. _fuzzy_lcs_indices in extradoc/src/extradoc/diffmerge/table_diff.py:188 matches rows by recall similarity over the set of cell-text hashes, with match_threshold=0.5:
    def _recall(b_set, d_set):
        if not b_set:
            return 1.0
        return len(b_set & d_set) / len(b_set)
For the JAWS row, base cell texts = {"JAWS", "853", "49%"}, desired cell texts = {"JAWS", "900", "52%"}. Intersection = {"JAWS"}, so recall = 1/3 ≈ 0.333, which is below the 0.5 threshold. The row fails to match.
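The arithmetic above can be checked directly. This is a minimal sketch, not the project's code: it compares raw cell texts instead of hashes, and it assumes a row pair matches when its score reaches the threshold (the exact comparison in _fuzzy_lcs_indices is not shown in this report).

```python
MATCH_THRESHOLD = 0.5  # threshold quoted above for _fuzzy_lcs_indices


def recall(base_cells: set, desired_cells: set) -> float:
    """Fraction of base cell texts that also appear in the desired row."""
    if not base_cells:
        return 1.0
    return len(base_cells & desired_cells) / len(base_cells)


base_row = {"JAWS", "853", "49%"}
desired_row = {"JAWS", "900", "52%"}

score = recall(base_row, desired_row)
# Assumption: a row pair matches when its score reaches the threshold.
rows_match = score >= MATCH_THRESHOLD

print(f"recall={score:.3f} match={rows_match}")
```

Run as is, this prints recall=0.333 match=False, which is the unmatched-row case described here.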
Consequence:
- diff_tables treats the base JAWS row as deleted and the desired JAWS row as a new insert (table_diff.py:325-331).
- DeleteTableRowOp is emitted, followed by InsertTableRowOp with new_cell_texts=["JAWS", "900", "52%"] (table_diff.py:454-526).
- At lowering time (extradoc/src/extradoc/reconcile_v3/lower.py:626, 3552), this becomes an insertTableRow request, which creates an empty row. Per Google Docs API behavior, cells in a newly inserted row inherit text style from the adjacent row, which in this doc is the bold header row. Then insertText populates the empty cells, and the inserted text picks up the inherited bold styling.
- Because there is no UpdateBodyContentOp for the JAWS row (it is no longer a matched row), nothing clears the inherited bold.
In summary: cell edits that touch 2 of 3 cells in a 3-column row (or any edit where < 50% of the base cell-text hashes survive) silently fall off the fuzzy-LCS path and get rewritten as a delete-row + insert-row pair. The inserted row then inherits styling from whatever row happens to be adjacent in the base table.
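To see how easily an edit crosses that line, the sketch below computes how many cells can change in a row before recall drops to the threshold or below. It assumes every cell text in the row is distinct and that each edited cell gets a value not already present in the row; the helper names are hypothetical and not part of extradoc. The check is strict (recall above 0.5), so the result holds whether or not the real comparison is inclusive.

```python
from fractions import Fraction

THRESHOLD = Fraction(1, 2)  # match_threshold=0.5 from the report


def recall_after_edits(n_cols: int, n_edited: int) -> Fraction:
    """Recall when n_edited of n_cols distinct cell texts are replaced."""
    if n_cols == 0:
        return Fraction(1)
    return Fraction(n_cols - n_edited, n_cols)


def max_safe_edits(n_cols: int) -> int:
    """Most cells that can change while recall stays strictly above the threshold."""
    return max(
        k for k in range(n_cols + 1)
        if recall_after_edits(n_cols, k) > THRESHOLD
    )


for n in range(1, 7):
    print(f"{n} columns: up to {max_safe_edits(n)} edited cell(s) still match")
```

Under these assumptions a 3-column row tolerates a single edited cell, and a 2-column row tolerates none, so narrow tables are the most exposed.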
Notes for a fix (not in scope for this issue)
Options worth considering:
- Lower / remove the fuzzy LCS recall threshold for simple tables, or fall back to positional row matching when row counts are unchanged on both sides.
- When emitting InsertTableRowOp, explicitly clear text style on the newly created cells before running insertText, so they don't inherit from the adjacent row.
- Prefer emitting per-cell UpdateBodyContentOps over row delete+insert whenever base and desired row counts are equal; the structural change is not needed.
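As a rough illustration of the positional-fallback idea, the sketch below pairs rows by index when the row count is unchanged and only then looks for differing cells. All names here (match_rows, changed_cells, no_fuzzy) are hypothetical and do not exist in extradoc; the real matcher's signature may differ.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[str]
Pairs = List[Tuple[int, int]]


def match_rows(base: Sequence[Row], desired: Sequence[Row],
               fuzzy_match: Callable[[Sequence[Row], Sequence[Row]], Pairs]) -> Pairs:
    """Pair rows by position when the row count is unchanged, else defer to fuzzy_match."""
    if len(base) == len(desired):
        return [(i, i) for i in range(len(base))]
    return fuzzy_match(base, desired)


def changed_cells(base_row: Row, desired_row: Row) -> List[int]:
    """Column indices whose text differs between two matched rows."""
    return [c for c, (b, d) in enumerate(zip(base_row, desired_row)) if b != d]


def no_fuzzy(base, desired):
    """Stand-in for the real fuzzy matcher; not reached in this example."""
    raise NotImplementedError


# Only the two rows quoted in the report; the remaining rows are omitted.
base = [["Screen Reader", "Responses", "Share"], ["JAWS", "853", "49%"]]
desired = [["Screen Reader", "Responses", "Share"], ["JAWS", "900", "52%"]]

pairs = match_rows(base, desired, no_fuzzy)
edits = {i: changed_cells(base[i], desired[j]) for i, j in pairs}
print(edits)
```

With this pairing the JAWS row stays matched and only columns 1 and 2 are flagged for per-cell updates. One caveat: a purely positional fallback would mis-pair rows if a single edit deletes one row and inserts another, since the count stays equal, so it likely needs a guard beyond the row-count check.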
Environment
- main @ f252136
- File: extradoc/src/extradoc/diffmerge/table_diff.py (_fuzzy_lcs_indices, threshold 0.5)
- File: extradoc/src/extradoc/diffmerge/diff.py:1131 (_diff_table)
- File: extradoc/src/extradoc/reconcile_v3/lower.py:3552 (insertTableRow lowering)