feat: DOCX export enhancements - configuration options, formatting fixes, and code centralization#273
Conversation
- Add asterisk (*) markers next to corresponding authors in author list - Add 'Correspondence:' section after affiliations with contact info - Support both plain text email and base64-encoded email64 fields - Handle multiple corresponding authors with semicolon separation Fixes issue where DOCX export was not indicating corresponding authors even though the metadata contained the information.
Add new configuration option `docx.hide_highlighting` to control whether colored highlighting is applied to references in DOCX output. Changes: - Add hide_highlighting parameter to DOCXWriter.write() method - Create _apply_highlight() helper to conditionally apply highlights - Update all highlight applications to use helper method - Add hide_highlighting config loading in DOCXExporter - Fix heading colors to always be black (not blue) - Default: false (highlighting enabled for backward compatibility) This allows users to generate clean DOCX files without colored highlighting while maintaining the feature for users who want it. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Code Review: PR #273 - hide_highlighting Config OptionSummaryThis PR adds a configuration option to disable colored highlighting in DOCX exports, which is useful for clean manuscript submissions. The implementation is well-structured and maintains backward compatibility. ✅ Strengths1. Clean Architecture
2. Backward Compatibility
3. Code Quality
4. Bug Fix Included
5. Testing
🔍 Observations & Suggestions1. Test Coverage Enhancement def test_hide_highlighting_option(sample_manuscript_path, tmp_path):
"""Test that hide_highlighting disables all colored highlights."""
# Export with highlighting enabled
exporter_with = DocxExporter(
manuscript_path=sample_manuscript_path,
# hide_highlighting defaults to False
)
result_with = exporter_with.export()
doc_with = Document(str(result_with))
# Export with highlighting disabled
# Note: This requires manuscript config or exporter config support
exporter_without = DocxExporter(
manuscript_path=sample_manuscript_path,
config={"docx": {"hide_highlighting": True}}
)
result_without = exporter_without.export()
doc_without = Document(str(result_without))
# Check that highlighted runs exist in first but not second
has_highlights_with = any(
run.font.highlight_color is not None
for p in doc_with.paragraphs
for run in p.runs
)
has_highlights_without = any(
run.font.highlight_color is not None
for p in doc_without.paragraphs
for run in p.runs
)
assert has_highlights_with, "Expected highlights with default config"
assert not has_highlights_without, "Expected no highlights with hide_highlighting=true"2. Duplicate Highlight Check if run_data.get("highlight_yellow"):
self._apply_highlight(run, WD_COLOR_INDEX.YELLOW)
run.font.size = Pt(10)
if run_data.get("highlight_yellow"): # Duplicate!
self._apply_highlight(run, WD_COLOR_INDEX.YELLOW)This appears to be an existing bug that should be fixed (possibly in a follow-up PR). 3. Documentation
4. Error Handling in def _apply_highlight(self, run, color: WD_COLOR_INDEX):
"""Apply highlight color to a run, unless highlighting is disabled."""
if not self.hide_highlighting:
if color is not None: # Guard against None colors
run.font.highlight_color = color🛡️ Security Considerations✅ No security concerns identified
🚀 Performance Considerations✅ Excellent performance characteristics
📊 Code Metrics
🎯 Final RecommendationAPPROVE with minor suggestions This is a well-implemented feature that:
Suggested Follow-ups (not blocking):
📋 Checklist
Great work! This PR is ready to merge. Review conducted following rxiv-maker CLAUDE.md guidelines |
There was a problem hiding this comment.
Pull request overview
This PR adds a new hide_highlighting configuration option for DOCX exports to control whether colored highlighting is applied to references and citations. This feature provides users with flexibility to generate clean manuscripts for journal submissions while maintaining the default behavior of showing highlights for internal reviews.
Key changes:
- Added
hide_highlightingparameter to control colored highlighting in DOCX exports - Introduced
_apply_highlight()helper method to conditionally apply highlights based on configuration - Fixed heading colors to always render as black instead of blue
- Added corresponding author feature with asterisk markers and email display (including base64 decoding support)
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/rxiv_maker/exporters/docx_writer.py |
Implements the core highlighting control logic with _apply_highlight() method, adds corresponding author display feature, fixes heading colors to black, and updates all highlight applications throughout the document |
src/rxiv_maker/exporters/docx_exporter.py |
Loads hide_highlighting config from YAML and passes it to the DocxWriter |
tests/fixtures/docx_export/sample_manuscript/2025__unknown_et_al__rxiv.docx |
Binary DOCX test fixture updated to reflect new features |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| corr_para.add_run(name) | ||
|
|
||
| corr_para.paragraph_format.space_after = Pt(12) | ||
| corr_para.runs[-1].font.size = Pt(10) |
There was a problem hiding this comment.
Setting font size on corr_para.runs[-1] is fragile because it assumes the last run exists and targets only the last piece of text. If multiple corresponding authors are present, only the last author's email will have the font size set to 10pt, while the others remain at default size. Consider applying the font size to all runs in the paragraph or to a specific style for consistency.
| corr_para.runs[-1].font.size = Pt(10) | |
| for run in corr_para.runs: | |
| run.font.size = Pt(10) |
| try: | ||
| email = base64.b64decode(email64).decode("utf-8") | ||
| except Exception: | ||
| email = "" |
There was a problem hiding this comment.
The Exception catch on line 298 is overly broad. It's better practice to catch specific exceptions (e.g., binascii.Error, UnicodeDecodeError) to avoid silently masking unexpected errors. This makes debugging easier and prevents hiding genuine issues.
| def _apply_highlight(self, run, color: WD_COLOR_INDEX): | ||
| """Apply highlight color to a run, unless highlighting is disabled. | ||
|
|
||
| Args: | ||
| run: The run object to apply highlighting to | ||
| color: The WD_COLOR_INDEX color to apply | ||
| """ | ||
| if not self.hide_highlighting: | ||
| run.font.highlight_color = color |
There was a problem hiding this comment.
The new hide_highlighting feature lacks test coverage. While the PR mentions that "all 11 DOCX integration tests pass," there are no tests that specifically verify the highlighting behavior when hide_highlighting is True vs False. Consider adding tests that:
- Verify highlights are applied when
hide_highlighting=False(default) - Verify highlights are NOT applied when
hide_highlighting=True - Test that references, citations, and comments respect this setting
| # Add corresponding author information if any | ||
| corresponding_authors = [a for a in authors if a.get("corresponding_author", False)] | ||
| if corresponding_authors: | ||
| corr_para = doc.add_paragraph() | ||
| corr_marker = corr_para.add_run("*") | ||
| corr_marker.font.superscript = True | ||
| corr_para.add_run(" Correspondence: ") | ||
|
|
||
| for i, author in enumerate(corresponding_authors): | ||
| if i > 0: | ||
| corr_para.add_run("; ") | ||
|
|
||
| name = author.get("name", "") | ||
| email = author.get("email", "") | ||
|
|
||
| # Decode email if it's base64 encoded | ||
| if not email: | ||
| email64 = author.get("email64", "") | ||
| if email64: | ||
| try: | ||
| email = base64.b64decode(email64).decode("utf-8") | ||
| except Exception: | ||
| email = "" | ||
|
|
||
| if email: | ||
| corr_para.add_run(f"{name} ({email})") | ||
| else: | ||
| corr_para.add_run(name) | ||
|
|
||
| corr_para.paragraph_format.space_after = Pt(12) | ||
| corr_para.runs[-1].font.size = Pt(10) |
There was a problem hiding this comment.
The corresponding author feature lacks test coverage. While there are existing tests for corresponding_author in other contexts, there are no DOCX-specific tests that verify:
- The asterisk marker appears next to corresponding authors
- The correspondence line is formatted correctly with email addresses
- Base64-encoded emails are properly decoded
- Multiple corresponding authors are handled correctly with semicolon separation
| if run_data.get("highlight_yellow"): | ||
| run.font.highlight_color = WD_COLOR_INDEX.YELLOW | ||
| self._apply_highlight(run, WD_COLOR_INDEX.YELLOW) |
There was a problem hiding this comment.
Duplicate condition: highlight_yellow is checked and applied twice in consecutive lines (499-500 and 502-503). The second check on line 502 is redundant and should be removed.
Standardize font sizes for improved readability and consistency: - Affiliations: 10pt → 8pt - Corresponding authors section: 10pt → 8pt (all runs) - Figure/table/equation legends: 7pt → 8pt - DOI highlighting now respects hide_highlighting config These changes create a more consistent and professional appearance for manuscript submissions, especially when hide_highlighting is enabled for clean output. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Code Review SummaryThis PR adds a valuable configuration option for DOCX export and improves the corresponding author display. The implementation is generally solid, but I've identified several areas for improvement. Strengths
Critical Issues1. Missing Test Coverage (HIGH)The new hide_highlighting feature has zero test coverage. Required tests:
2. Duplicate Code (MEDIUM)Lines 523-525 in docx_writer.py contain duplicate highlight_yellow checks. Remove the duplicate. 3. Missing Configuration Documentation (MEDIUM)The hide_highlighting option is not documented in MANUSCRIPT/00_CONFIG.yml 4. Corresponding Author Feature Lacks Tests (HIGH)The corresponding author indication feature has no dedicated tests for asterisk markers, correspondence section, or base64 email decoding. Security ConcernsBase64 email decoding (lines 296-301) uses broad exception handler. Use specific exception types and consider logging failures. Code Quality Issues
RecommendationsPriority 1 (Must Fix):
Priority 2 (Should Fix): Overall AssessmentThis is a useful feature that addresses a real need, but it needs test coverage before it can be safely merged. Per CLAUDE.md: Maintain comprehensive test coverage (aim for 80%) - This PR adds approximately 100 lines of new code with 0 test lines. Recommendation: Request changes to add tests before approval. |
Add new configuration option `docx.hide_comments` to control whether comments (both block and inline) are included in DOCX output. Changes: - Add hide_comments parameter to DOCXWriter.write() method - Skip comment sections when hide_comments is enabled - Skip inline comments in paragraphs, lists, and captions - Add hide_comments config loading in DOCXExporter - Default: false (include comments for backward compatibility) Configuration: ```yaml docx: hide_comments: false # Set to true to exclude all comments ``` Use Cases: - Clean submission: Set to true for journal submissions - Internal review: Set to false to keep comments for collaboration - Manuscript drafts: Keep comments during development phase 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Explicitly set all heading text to black RGB(0, 0, 0) instead of relying on style defaults. This ensures consistent black headings across all DOCX viewers and platforms. Changes: - Import RGBColor from docx.shared - Update _add_heading() to use RGBColor(0, 0, 0) - Update _add_snote_title() to use RGBColor(0, 0, 0) - Update "Figures" heading to use RGBColor(0, 0, 0) - Update "Bibliography" heading to use RGBColor(0, 0, 0) Previously, setting to None would reset to style defaults which could be non-black in some Word templates. Explicit RGB values ensure headings are always black regardless of template. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
Code Review: PR 273 - Add hide_highlighting and hide_comments config options Overview: This PR adds two new DOCX export configuration options (hide_highlighting and hide_comments) along with font size improvements. The implementation is well-structured and maintains backward compatibility. STRENGTHS:
CRITICAL ISSUES:
MEDIUM PRIORITY: RECOMMENDATIONS: OVERALL ASSESSMENT: Great work on the implementation! Once the merge conflict is resolved and tests are added, this will be ready to merge. |
Code Review: PR #273 - Add hide_highlighting and hide_comments config optionsOverall AssessmentThis is a well-structured PR that adds valuable flexibility to DOCX export. The implementation follows the project's conventions and maintains backward compatibility. However, there are several issues that should be addressed before merging. Critical Issues1. Duplicate Code in List ProcessingLocation: src/rxiv_maker/exporters/docx_writer.py:509-513 Lines 509-513 contain duplicate code that applies yellow highlighting twice. Remove lines 512-513. 2. Missing Test CoverageNo tests were added for the new hide_highlighting and hide_comments features. The PR mentions passing 11 tests but doesn't verify the new functionality works correctly. Recommendation: Add integration tests to verify highlighting can be disabled and comments can be hidden. Design Issues3. Potential Bug: Authors List ScopeLocation: src/rxiv_maker/exporters/docx_writer.py:281-282 The corresponding_authors code references the authors variable outside its scope block. This could fail if authors is not defined in metadata. Fix: Move inside the if authors: block or add a guard. 4. Inconsistent Highlight Handling in HyperlinksLocation: src/rxiv_maker/exporters/docx_writer.py:517 Line 517 hardcodes highlight=True instead of respecting hide_highlighting setting. Compare with bibliography DOI handling on line 163 which correctly uses highlight=not self.hide_highlighting. 5. Base64 Import Added But Not DocumentedLocation: src/rxiv_maker/exporters/docx_writer.py:7 Line 7 adds import base64 for email decoding, but this appears unrelated to the hide_highlighting/hide_comments features. This seems like scope creep. Question: Is the base64 email decoding part of a different feature? Code Quality Suggestions6. Font Size Changes Not in Feature ScopeThe PR includes font size changes unrelated to highlighting/comments:
Consider splitting these into a separate PR for easier review. 7. Code OrganizationThe hide_highlighting and hide_comments instance variables are set in write() but could be initialized in init() for clarity. What's Done Well
Security and Performance
Checklist for AuthorBefore merging, please address:
Final RecommendationStatus: Needs Revision This PR adds valuable functionality but requires fixes before merging. The duplicate code and missing tests are blockers. Once addressed, this will be a solid addition to the DOCX export capabilities. Great work on the clean abstraction with _apply_highlight() - that's exactly the right approach! |
Extract commonly duplicated code into shared utility modules for better maintainability and code reuse. This is Phase 1 of code centralization between DOCX export and LaTeX/PDF generation. New shared utility modules: 1. **utils/accent_character_map.py** (~145 lines) - Centralize 60+ LaTeX accent character mappings - Function: clean_latex_accents() - Reduces duplication from docx_helpers.py 2. **utils/comment_filter.py** (~50 lines) - Centralize metadata comment filtering logic - Function: is_metadata_comment() - Extracted from docx_content_processor.py - Now available for LaTeX processing 3. **utils/citation_range_formatter.py** (~120 lines) - Centralize citation range formatting ([1,2,3] → [1-3]) - Functions: format_number_list(), format_citation_ranges() - Extracted from docx_citation_mapper.py - Now available for LaTeX processing Changes to existing modules: - docx_helpers.py: Use clean_latex_accents() from new module (~112 lines removed) - docx_content_processor.py: Use is_metadata_comment() from new module (~18 lines removed) - docx_citation_mapper.py: Use format_citation_ranges() from new module (~94 lines removed) Impact: - ~224 lines of code eliminated through centralization - Three new shared utilities available for LaTeX processing - All 11 DOCX integration tests pass ✅ - All 35 DOCX unit tests pass ✅ See analysis in conversation for additional centralization opportunities identified for Phase 2 (author processing, label extraction, SI loading). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
Code Review - PR 273 Thank you for this well-structured PR adding flexibility to DOCX export. STRENGTHS:
HIGH PRIORITY ISSUES:
MEDIUM PRIORITY:
TEST COVERAGE: OVERALL ASSESSMENT: Score: 8/10 - Excellent implementation, needs test coverage. Great work! These options will be very useful for manuscript submission. |
Create centralized utility for extracting and mapping reference labels (figures, tables, equations, supplementary elements) from manuscript content. New module: utils/label_extractor.py - LabelExtractor class with methods for each label type - extract_figure_labels() - Main figures - extract_supplementary_figure_labels() - Supplementary figures - extract_supplementary_table_labels() - Supplementary tables (both markdown and LaTeX formats) - extract_supplementary_note_labels() - Supplementary notes - extract_equation_labels() - Equations - extract_all_labels() - Convenience method for all types Changes to docx_exporter.py: - Replace inline regex patterns with LabelExtractor calls - Reduces ~40 lines of duplicated regex logic - Maintains exact same behavior and label ordering Benefits: - Single source of truth for label extraction patterns - Available for LaTeX processing to use - Easier to test label extraction independently - Consistent label handling across formats Testing: - All 11 DOCX integration tests pass ✅ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
This is a well-structured PR with excellent refactoring. However, it requires unit tests for the 4 new utility modules and integration tests for the new config options before merging. See full review details in next comment. |
|
STRENGTHS:
|
|
CRITICAL ISSUES - Missing test coverage for 4 new utility modules. Required: unit tests for accent_character_map, citation_range_formatter, comment_filter, and label_extractor. Also need integration tests for hide_highlighting and hide_comments config options. Missing documentation in CHANGELOG.md and user-facing docs. |
|
CODE QUALITY ISSUES:
|
|
RECOMMENDATIONS: Must Fix Before Merge:
Should Fix: Testing: Run nox -s test, nox -s test-unit, nox -s lint, nox -s type-check |
|
FINAL VERDICT: Status: Needs Work (primarily tests and documentation) Ratings:
Summary: The code implementation is solid and follows best practices. The refactoring significantly improves maintainability. However, the lack of unit tests for approximately 500 lines of new utility code is a significant concern that must be addressed before merging. Once tests and documentation are added, this will be an excellent contribution. Great work on the refactoring and feature additions! The code quality is high. |
Add visual indication and explanatory note for co-first authors in DOCX exports, matching the functionality available in LaTeX/PDF generation. Changes: - Add dagger (†) superscript marker next to co-first author names - Add "These authors contributed equally" note section after affiliations - Support co_first_author field from author metadata - Maintain consistency with LaTeX approach (which uses * marker) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Extract author and affiliation mapping logic into a shared utility module, reducing duplication between DOCX and LaTeX generation code. Changes: - Create utils/author_affiliation_processor.py with AuthorAffiliationProcessor class - Refactor docx_writer.py to use centralized processor - Remove ~20 lines of duplicate affiliation mapping code from DOCX writer - Maintain identical output behavior (all tests pass) The processor handles: - Building affiliation shortname → number mappings - Looking up full affiliation details from metadata - Categorizing authors (co-first, corresponding) - Ordering affiliations by first appearance Part of Phase 2 code centralization between DOCX and PDF/LaTeX generation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pull Request Review: DOCX Export Enhancements🎯 Overall AssessmentThis is a well-structured, high-quality PR that significantly improves the DOCX export functionality. The code demonstrates excellent software engineering practices with meaningful refactoring, new features, and strong attention to maintainability. Summary:
🟢 Strengths1. Excellent Code Centralization ⭐⭐⭐⭐⭐The creation of shared utility modules is a major win:
Impact: ~224 lines eliminated from duplicated code, now available for LaTeX/PDF generation to use. This is textbook DRY (Don't Repeat Yourself) principle in action. 2. Clean API DesignNew configuration options are well-designed: docx:
hide_highlighting: false # Control colored highlights
hide_comments: false # Control comment inclusion
3. Feature CompletenessNew features fill important gaps:
4. Documentation QualityAll new utility modules include:
5. Proper AbstractionThe 🟡 Areas for Improvement1. Missing Unit Tests for New Utilities
|
… DOCX Make DOCX exports more consistent and professional with improved typography: - Set Arial as default font for entire document (Normal style + all heading styles) - Set affiliation numbering to 8pt to match affiliation text size Changes: - Add _set_default_font() method to configure document-wide font - Set Arial on Normal style (base for all content) and all heading styles - Fix affiliation number superscripts to use 8pt font size This ensures consistent typography throughout the DOCX export. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Comprehensive PR Review: DOCX Export EnhancementsThank you for this comprehensive PR! This is excellent work that significantly improves code quality through centralization while adding useful new features. Here's my detailed review: ✅ Strengths1. Excellent Code Centralization ⭐The extraction of shared logic into utility modules is exemplary:
This follows the DRY principle perfectly and creates a single source of truth for cross-format operations. 2. Backward CompatibilityThe new configuration options have sensible defaults: hide_highlighting: false # Maintains existing behavior
hide_comments: false # Maintains existing behavior3. Feature CompletenessThe corresponding author and co-first author support is well-implemented:
4. Formatting Improvements
|
Fix init command test failures in CI by using sys.executable instead of relying on 'rxiv' being in PATH. This ensures tests work correctly in nox virtual environments where the rxiv command may not be directly accessible via subprocess calls. Root cause: - Tests were calling subprocess.run(["rxiv", "init", ...]) - In CI nox environments, 'rxiv' command not always in PATH for subprocesses - Caused "Error: Directory" truncated error message Solution: - Use [sys.executable, "-m", "rxiv_maker.cli"] for all subprocess calls - Ensures we use the same Python environment running the tests - Works reliably in both local development and CI environments This fixes pre-existing CI failures on main branch where all init tests were failing with the same error. Tested: - All 18 init command tests pass locally - Should fix Coverage Validation job failures on CI 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Code Review: DOCX Export EnhancementsThank you for this comprehensive PR! This is a well-structured refactoring that brings significant improvements to code quality and maintainability. Here's my detailed review: ✅ Strengths1. Excellent Code Centralization
2. Clean Implementation
3. Backward Compatibility
4. Professional Formatting Improvements
|
Root cause: InitCommand.setup_common_options was reading from the MANUSCRIPT_PATH environment variable when manuscript_path argument was None. This caused test isolation issues where one test would set the environment variable and all subsequent init tests would try to create manuscripts in the same location instead of their own temp directories. Solution: Remove EnvironmentManager.get_manuscript_path() from init command. The MANUSCRIPT_PATH environment variable is meant for finding existing manuscripts (build/pdf commands), not for determining where to initialize new manuscripts. Init should only use: 1. Explicitly provided manuscript_path argument, OR 2. Default to "MANUSCRIPT" in current working directory Testing: All 25 init command tests now pass locally. Fixes test failures in CI where init tests were getting: "Directory '/tmp/pytest-of-runner/pytest-0/test_get_debug_info0/manuscript' already exists" 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Code Review - PR #273: DOCX Export EnhancementsOverviewThis is an excellent PR that makes substantial improvements to the DOCX export functionality. The code centralization effort is particularly valuable, and the new configuration options enhance flexibility. Overall, the changes are well-structured and follow good software engineering practices. ✅ Strengths1. Excellent Code CentralizationThe creation of 5 shared utility modules is a major win:
This reduces duplication, improves maintainability, and sets the stage for future export formats. 2. Clean API DesignThe utility modules have clear, focused responsibilities with:
3. Backward CompatibilityAll new config options default to preserving existing behavior:
4. Professional Typography Improvements
5. Enhanced Author Metadata
|
Version bump: 1.17.0 → 1.18.0 This release includes comprehensive DOCX export improvements: - New configuration options (hide_highlighting, hide_comments) - Co-first and corresponding author support with proper markers - Professional typography (Arial font, standardized sizing) - Five new centralized utility modules for code reusability - Fixed init command environment variable and CI compatibility issues See CHANGELOG.md for full details. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
Comprehensive review of PR 273 completed. Overall this is an excellent contribution with high code quality sound architecture and thorough documentation. Major strengths include outstanding code centralization with 5 shared utility modules that eliminate approximately 100 lines of duplicate code excellent documentation with comprehensive docstrings backward compatibility maintained and professional typography improvements. Critical issue identified is missing unit tests for the 5 new utility modules which are required per project standards to meet the 80 percent coverage goal. Required test files needed for accent_character_map comment_filter citation_range_formatter label_extractor and author_affiliation_processor. Minor suggestions include using more specific exception handling compiling regex patterns at class level and adding config validation. Security review shows no concerns. Architecture is excellent with utilities in correct location following Single Responsibility Principle with no circular dependencies. Documentation is excellent with CHANGELOG following format perfectly. Integration shows excellent refactoring reducing docx_citation_mapper from 100 lines to 20 lines and docx_helpers from 117 lines to 4 lines. Final recommendation is approve with changes requested specifically adding unit tests for 5 new utility modules verifying coverage meets 80 percent threshold and running full test suite. Overall high quality work with excellent architecture and refactoring that with unit tests added will be an exemplary contribution. Impact includes 100 lines duplicate code removed improved maintainability new features for co-first and corresponding authors and no breaking changes but missing unit tests must be addressed. Reviewed per CLAUDE.md standards. |
|
DETAILED CODE REVIEW BREAKDOWN STRENGTHS
CRITICAL REQUIREMENT MISSING UNIT TESTS MINOR IMPROVEMENTS SUGGESTED
SECURITY REVIEW PASSED ARCHITECTURE EXCELLENT REFACTORING IMPACT TESTING CHECKLIST BEFORE MERGE
OVERALL ASSESSMENT |
The sample_changelog fixture needs to include v1.18.0 as the current version for the test_changelog_default_shows_current_version test to pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Comprehensive Code Review: PR #273Executive SummaryThis PR represents excellent software engineering with significant improvements to the DOCX export system. The code centralization, new features, and attention to backward compatibility are commendable. However, there are important considerations around test coverage for the new utility modules. Strengths1. Outstanding Code CentralizationThe creation of 5 shared utility modules is a textbook example of DRY principles:
Impact: Reduces bug surface area and ensures DOCX and PDF/LaTeX exports behave identically. 2. Clean API DesignNew configuration options follow existing patterns - consistent naming, sensible defaults, backward compatible. 3. Professional Typography
4. Comprehensive Author Metadata Support
5. Well-Documented Changes
Issues and Concerns1. CRITICAL: Missing Test Coverage for New UtilitiesThe 5 new utility modules have zero dedicated unit tests. I searched and found none importing these modules. What is needed:
Why this matters: While integration tests pass, utility modules should have direct unit tests to document expected behavior, catch regressions, and verify edge cases. 2. Potential Bug: Email Decoding Error HandlingIn docx_writer.py lines 245-251, base64 email decoding has bare except clause catching ALL exceptions with silent failure and no logging. Recommendation: Use specific exceptions and add logging 3. Missing Tests for New Configuration OptionsNo tests verify hide_highlighting and hide_comments work correctly 4. Inconsistent Logic in LabelExtractorextract_supplementary_table_labels() has complex duplicate removal using side effects. No other methods do this - why only tables? 5. Other Minor Issues
Testing AssessmentCurrent: All 11 DOCX integration tests pass Required before merge:
Final RecommendationVerdict: Approve with Conditions This is high-quality work that significantly improves the codebase. Code centralization is exemplary. However, missing test coverage is a blocker. Questions
Overall: Excellent work! Address the test coverage and this will be a stellar contribution. |
- Update test_changelog_breaking_only_flag to use --recent 7 (was 6) since v1.12.0 with BREAKING changes is now the 7th version after adding v1.18.0 to the fixture - Update test_changelog_since_with_no_versions to use v1.18.0 (was v1.17.0) to ensure no versions found after the latest version 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
Code Review: PR 273 - DOCX Export Enhancements This is an excellent, well-structured PR. Overall Assessment: APPROVED Strengths:
CRITICAL Issue: Missing Unit Tests Code Quality Issues:
Performance:
Security:
Documentation Gaps:
Recommendation: Approve with follow-up for unit tests. The refactoring is excellent but unit tests should be added in immediate next PR. |
This PR includes comprehensive improvements to DOCX export:
New Features
Configuration Options
docx.hide_highlighting: Control colored highlighting of references (default: false)docx.hide_comments: Control comment inclusion in output (default: false)Author Metadata Support
emailand base64-encodedemail64fieldsTypography & Formatting Improvements
Code Centralization (Phase 1 & 2)
Created 5 shared utility modules to reduce duplication between DOCX and PDF/LaTeX generation:
Phase 1 - Text Processing Utilities
utils/accent_character_map.py: LaTeX accent → Unicode conversion (60+ mappings)
utils/comment_filter.py: Metadata comment filtering
utils/citation_range_formatter.py: Citation range formatting
Phase 2 - Structural Processing Utilities
utils/label_extractor.py: Label extraction for cross-references (~195 lines)
{#fig:label}) and LaTeX (\\label{stable:label}) formatsutils/author_affiliation_processor.py: Author and affiliation processing (~130 lines)
Impact
Testing
Files Modified
Core DOCX Export:
src/rxiv_maker/exporters/docx_writer.py- Added co-first author support, integrated processors, set default font, fixed font sizessrc/rxiv_maker/exporters/docx_exporter.py- Added config options, used label extractorsrc/rxiv_maker/exporters/docx_content_processor.py- Uses comment filtersrc/rxiv_maker/exporters/docx_helpers.py- Uses accent mapsrc/rxiv_maker/exporters/docx_citation_mapper.py- Uses citation range formatterNew Utility Modules:
src/rxiv_maker/utils/accent_character_map.pysrc/rxiv_maker/utils/comment_filter.pysrc/rxiv_maker/utils/citation_range_formatter.pysrc/rxiv_maker/utils/label_extractor.pysrc/rxiv_maker/utils/author_affiliation_processor.py