MFB-62: readability check by SonaliBedge · Pull Request #1337 · MyFriendBen/benefits-api

SonaliBedge · 2026-01-13T00:36:40Z

Context & Motivation

Fixes : #MFB-62
Related PR: None
Old PR : #1855
- Implements translation readability checking to ensure content is accessible to users with varying reading levels
- Goal: English content at or below 8th grade reading level, Spanish content scoring 60+ on Fernández-Huerta

Changes Made

New management command: python manage.py check_readability to analyze translation text readability
Readability metrics:
English: Flesch-Kincaid Grade Level (target ≤ 8.0)
Spanish: Fernández-Huerta score (target ≥ 60)
Add readability command output files to .gitignore

Commands:

Check all English translations

python manage.py check_readability --language en-us

Check English for specific white-label (Colorado)

python manage.py check_readability --language en --whitelabel co

Check Spanish translations

python manage.py check_readability --language es --whitelabel co

With custom threshold (6th grade level)

python manage.py check_readability --language en --whitelabel co --threshold 6

Show detailed scores for each translation

python manage.py check_readability --language en --whitelabel co --detailed

CI/CD mode - exit with error code if failures found

python manage.py check_readability --language en --whitelabel co --fail-on-error

Also show passing translations

python manage.py check_readability --language en --whitelabel co --show-passing

Command options:

[--language] / -l: Language code (en-us, es)
[--whitelabel] / -w: Filter by white label (e.g., co, ma, nc)
[--threshold] / -t: Custom threshold override
[--detailed]: Show all readability scores
[--show-passing]: Include passing translations in output
--fail-on-error: Exit with error code for CI/CD integration
--min-words: Minimum word count for analysis (default: 10)
[--label-filter]: Filter by label prefix
New dependency: Added [textstat>=0.7.3] to [requirements.txt]
Unit tests: Added comprehensive tests in [test_check_readability.py]
Documentation: Updated README with usage examples

Testing

Migrations to run: None
Configuration updates needed: None
Environment variables/settings to add: None
Manual testing steps:

Install dependencies

pip install -r requirements.txt

Download NLTK data (first time only)

python -c "import nltk; nltk.download('cmudict')"

Run readability check

python manage.py check_readability --language en --whitelabel co

Run unit tests

pytest translations/management/commands/tests/test_check_readability.py -v

Deployment

Post-deployment scripts needed: None
Production config updates: None
Admin updates needed: None
Notify team/users of: check_readability command available for analyzing translation content

Notes for Reviewers

Known limitations:

Very short texts (<10 words) are skipped as readability metrics are unreliable for them
NLTK's cmudict data must be downloaded for accurate syllable counting
Future considerations:

Add GitHub Actions workflow (.github/workflows/readability.yml) for automated PR checks
Consider adding pre-commit hook for local validation
May want to adjust thresholds after initial content review period

Summary by CodeRabbit

New Features
- Adds a readability check for translations (English & Spanish) with language-aware metrics, thresholds, filtering, reporting, and optional fail-on-error.
Documentation
- Adds a "Readability Check" section describing workflow, metrics, CLI options, thresholds, and export formats.
Tests
- Adds comprehensive tests for readability analysis, thresholds, CLI options, filtering, outputs, and failure behavior.
Chores
- Adds runtime dependency: textstat>=0.7.3.

coderabbitai · 2026-01-13T00:36:51Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Note

`.coderabbit.yaml` has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)

Validation error: Unrecognized key(s) in object: 'base_branches'

⚙️ Configuration instructions

Please see the configuration documentation for more information.
You can also validate your configuration using the online YAML validator.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

📝 Walkthrough

Walkthrough

Adds a README section describing a readability check, a new dependency (textstat), a Django management command check_readability implementing English/Spanish readability analysis and reporting, and unit tests for the checker and command. No existing public APIs modified.

Changes

Cohort / File(s)	Summary
Documentation `README.md`	Appended a "Readability Check" section with workflow, metrics (English: Flesch‑Kincaid; Spanish: Fernández‑Huerta), CLI usage, flags, and examples.
Dependencies `requirements.txt`	Added dependency: `textstat>=0.7.3`.
Readability command implementation `translations/management/commands/check_readability.py`	New Django management command adding `ReadabilityResult` (dataclass), `ReadabilityChecker` (English/Spanish analysis, word counting, thresholds, min-word bypass), and `Command` with CLI options, whitelabel filtering, formatted summary, and JSON/CSV/text export helpers.
Tests `translations/management/commands/tests/__init__.py`, `translations/management/commands/tests/test_check_readability.py`	Added tests for `ReadabilityChecker` and command behavior (language handling, thresholds, short/empty texts, whitelabel filtering, --fail-on-error, output assertions).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'MFB-62: readability check' directly corresponds to the main objective of implementing a readability checking feature for translations.
Docstring Coverage	✅ Passed	Docstring coverage is 91.18% which is sufficient. The required threshold is 80.00%.
Description check	✅ Passed	The pull request description comprehensively covers all required template sections with concrete details about motivation, changes, testing steps, and deployment considerations.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch sonali/mfb-readability-check

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Tip

You can enable review details to help with troubleshooting, context usage and more.

Enable the reviews.review_details setting to include review details such as the model used, the time taken for each step and more in the review comments.

codecov · 2026-01-13T00:39:19Z

Codecov Report

❌ Patch coverage is 83.89423% with 67 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...nslations/management/commands/check_readability.py	74.23%	51 Missing and 16 partials ⚠️

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @translations/management/commands/check_readability.py:
- Around line 261-273: The translation_fields list in
translations/management/commands/check_readability.py is out of sync with the
Program model: remove the non-existent fields name_short and category, replace
value_description with the correct Program field name(s) (use estimated_value
and/or value_type as defined on Program), and add the missing fields
estimated_value and website_description; update the translation_fields array
accordingly and verify the final names exactly match the Program model's field
names.

🧹 Nitpick comments (6)

translations/management/commands/check_readability.py (3)
180-184: Consider passing min_words to the constructor instead of mutating the class attribute.

Setting checker.MIN_WORD_COUNT = min_words on the instance works but is unconventional. Passing it as a constructor parameter would be cleaner.
Suggested improvement
 class ReadabilityChecker:
-    def __init__(self, en_threshold: Optional[float] = None, es_threshold: Optional[float] = None):
+    def __init__(self, en_threshold: Optional[float] = None, es_threshold: Optional[float] = None, min_word_count: Optional[int] = None):
         self.en_threshold = en_threshold if en_threshold is not None else self.ENGLISH_THRESHOLD
         self.es_threshold = es_threshold if es_threshold is not None else self.SPANISH_THRESHOLD
+        self.min_word_count = min_word_count if min_word_count is not None else self.MIN_WORD_COUNT
Then in check(), use self.min_word_count instead of self.MIN_WORD_COUNT.
305-310: Remove extraneous f prefix from strings without placeholders.

Lines 306 and 309 use f-strings but contain no placeholders.
Suggested fix
         if language.startswith("es"):
-            self.stdout.write(f"Metric: Fernández-Huerta (higher is better)")
+            self.stdout.write("Metric: Fernández-Huerta (higher is better)")
             self.stdout.write(f"Threshold: >= {checker.es_threshold}")
         else:
-            self.stdout.write(f"Metric: Flesch-Kincaid Grade Level (lower is better)")
+            self.stdout.write("Metric: Flesch-Kincaid Grade Level (lower is better)")
             self.stdout.write(f"Threshold: <= {checker.en_threshold} (8th grade)")
355-358: Remove unused style variable.

The style variable is assigned but never used.
Suggested fix
     def _print_result(self, result: ReadabilityResult, detailed: bool, is_failing: bool):
         """Print details for a single translation result."""
-        style = self.style.ERROR if is_failing else self.style.SUCCESS
         icon = "❌" if is_failing else "✅"
translations/management/commands/tests/test_check_readability.py (2)
159-164: Use _ for unused lambda parameter.

The self parameter in the lambda is unused. Using _ is the Python convention for ignored parameters and silences the static analysis warning.
Suggested fix (apply to all similar occurrences)
-        mock_queryset.__iter__ = lambda self: iter([])
+        mock_queryset.__iter__ = lambda _: iter([])
Same change applies to lines 178, 194, 224, and 251.
155-166: Consider extracting mock queryset setup to reduce duplication.

The mock queryset setup pattern is repeated 5 times across the test class. A helper method or pytest fixture would improve maintainability.
Example helper method
def _create_mock_queryset(self, translations=None):
    """Create a mock queryset for Translation.objects."""
    mock_queryset = MagicMock()
    mock_queryset.filter.return_value = mock_queryset
    mock_queryset.prefetch_related.return_value = mock_queryset
    translations = translations or []
    mock_queryset.count.return_value = len(translations)
    mock_queryset.__iter__ = lambda _: iter(translations)
    return mock_queryset
requirements.txt (1)
95-95: Use exact version pinning to match other dependencies.

All other dependencies use == for exact version pinning, but textstat uses >=0.7.3. This inconsistency could result in non-reproducible builds if textstat releases a breaking change.
Suggested fix
-textstat>=0.7.3
+textstat==0.7.3

📜 Review details

Configuration used: Repository: MyFriendBen/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e3a590 and fff14c8.

📒 Files selected for processing (5)

README.md
requirements.txt
translations/management/commands/check_readability.py
translations/management/commands/tests/__init__.py
translations/management/commands/tests/test_check_readability.py

🧰 Additional context used

🧬 Code graph analysis (2)

translations/management/commands/tests/test_check_readability.py (1)

translations/management/commands/check_readability.py (5)

ReadabilityResult (26-36)

analyze_english (61-71)

analyze_spanish (73-80)

check (88-136)

get_word_count (82-86)

translations/management/commands/check_readability.py (3)

translations/models.py (1)

Translation (167-304)

programs/models.py (1)

Program (597-773)

screener/models.py (1)

WhiteLabel (12-19)

🪛 Ruff (0.14.10)

translations/management/commands/tests/test_check_readability.py

163-163: Unused lambda argument: self

(ARG005)

178-178: Unused lambda argument: self

(ARG005)

194-194: Unused lambda argument: self

(ARG005)

224-224: Unused lambda argument: self

(ARG005)

251-251: Unused lambda argument: self

(ARG005)

translations/management/commands/check_readability.py

169-169: Unused method argument: args

(ARG002)

196-196: Avoid specifying long messages outside the exception class

(TRY003)

239-239: Avoid specifying long messages outside the exception class

(TRY003)

306-306: f-string without any placeholders

Remove extraneous f prefix

(F541)

309-309: f-string without any placeholders

Remove extraneous f prefix

(F541)

357-357: Local variable style is assigned to but never used

Remove assignment to unused variable style

(F841)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run Tests with Coverage

🔇 Additional comments (8)

translations/management/commands/tests/__init__.py (1)

1-1: LGTM!

Standard package initializer for the test module.

translations/management/commands/check_readability.py (4)

1-23: LGTM!

Well-documented module with clear usage examples in the docstring. Imports are properly organized.

25-37: LGTM!

Clean dataclass design with appropriate type annotations for all fields.

61-80: Thread safety concern with textstat.set_lang().

textstat.set_lang() modifies global state. If this command is ever run concurrently (e.g., in a multi-threaded context or parallel test execution), there could be race conditions where one thread's language setting affects another's analysis.

For a Django management command run sequentially, this is acceptable, but worth noting for future parallel usage.

142-168: LGTM!

CLI arguments are well-defined with clear help text and sensible defaults.

translations/management/commands/tests/test_check_readability.py (2)

22-145: LGTM!

Comprehensive test coverage for ReadabilityChecker including edge cases for empty, None, and short texts. Tests are well-structured and clearly named.

260-282: LGTM!

Adequate test coverage for the ReadabilityResult dataclass.

README.md (1)

34-65: LGTM!

Clear and comprehensive documentation for the new readability check feature. The examples cover all major use cases and the explanation of metrics is helpful.

translations/management/commands/check_readability.py

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @translations/management/commands/check_readability.py:
- Around line 115-126: The check method currently falls back to English for any
non-Spanish language; add explicit handling for Vietnamese by checking if
lang.startswith("vi") before the else branch, and when detected write a clear
warning (e.g., via self.stdout.write or self.stderr.write) that Vietnamese is
unsupported and skip analysis (set scores/primary_score/threshold to None or a
sentinel and mark passes as False/skip), leaving analyze_spanish and
analyze_english unchanged; ensure any code that consumes primary_score/threshold
handles the sentinel, and update the command help text or docs to reflect that
vi is unsupported if you prefer not to analyze it.

🧹 Nitpick comments (4)

translations/management/commands/check_readability.py (4)

180-184: Mutating class constant via instance attribute assignment is confusing.

Line 184 assigns to checker.MIN_WORD_COUNT, which shadows the class-level constant with an instance attribute. While this works, it's an unusual pattern that may confuse future maintainers.

♻️ Suggested refactor: Accept min_words as a constructor parameter

-    def __init__(self, en_threshold: Optional[float] = None, es_threshold: Optional[float] = None):
+    def __init__(
+        self,
+        en_threshold: Optional[float] = None,
+        es_threshold: Optional[float] = None,
+        min_word_count: Optional[int] = None,
+    ):
         self.en_threshold = en_threshold if en_threshold is not None else self.ENGLISH_THRESHOLD
         self.es_threshold = es_threshold if es_threshold is not None else self.SPANISH_THRESHOLD
+        self.min_word_count = min_word_count if min_word_count is not None else self.MIN_WORD_COUNT

Then use self.min_word_count instead of self.MIN_WORD_COUNT in the check method, and update the Command:

         checker = ReadabilityChecker(
             en_threshold=threshold if language.startswith("en") else None,
             es_threshold=threshold if language.startswith("es") else None,
+            min_word_count=min_words,
         )
-        checker.MIN_WORD_COUNT = min_words

169-177: Consider validating the language option against supported languages.

The language option accepts any string, but only en-* and es-* are meaningfully supported. Invalid language codes could lead to unexpected behavior (e.g., translation.set_current_language() with an unsupported code).

♻️ Suggested validation

+        SUPPORTED_LANGUAGES = {"en-us", "es", "vi"}
         language = options["language"]
+        if language not in SUPPORTED_LANGUAGES and not any(language.startswith(p) for p in ("en", "es")):
+            self.stderr.write(self.style.WARNING(
+                f"Warning: '{language}' may not be fully supported. Supported: en-us, es, vi"
+            ))

306-310: Remove extraneous f prefix from strings without placeholders.

Lines 306 and 309 use f-strings but have no placeholders.

♻️ Proposed fix

         if language.startswith("es"):
-            self.stdout.write(f"Metric: Fernández-Huerta (higher is better)")
+            self.stdout.write("Metric: Fernández-Huerta (higher is better)")
             self.stdout.write(f"Threshold: >= {checker.es_threshold}")
         else:
-            self.stdout.write(f"Metric: Flesch-Kincaid Grade Level (lower is better)")
+            self.stdout.write("Metric: Flesch-Kincaid Grade Level (lower is better)")
             self.stdout.write(f"Threshold: <= {checker.en_threshold} (8th grade)")

355-358: Remove unused style variable.

The style variable is assigned but never used.

♻️ Proposed fix

     def _print_result(self, result: ReadabilityResult, detailed: bool, is_failing: bool):
         """Print details for a single translation result."""
-        style = self.style.ERROR if is_failing else self.style.SUCCESS
         icon = "❌" if is_failing else "✅"

📜 Review details

Configuration used: Repository: MyFriendBen/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fff14c8 and dea94a1.

📒 Files selected for processing (1)

translations/management/commands/check_readability.py

🧰 Additional context used

🧬 Code graph analysis (1)

translations/management/commands/check_readability.py (3)

translations/models.py (1)

Translation (167-304)

programs/models.py (1)

Program (597-773)

screener/models.py (1)

WhiteLabel (12-19)

🪛 Ruff (0.14.10)

translations/management/commands/check_readability.py

169-169: Unused method argument: args

(ARG002)

196-196: Avoid specifying long messages outside the exception class

(TRY003)

239-239: Avoid specifying long messages outside the exception class

(TRY003)

306-306: f-string without any placeholders

Remove extraneous f prefix

(F541)

309-309: f-string without any placeholders

Remove extraneous f prefix

(F541)

357-357: Local variable style is assigned to but never used

Remove assignment to unused variable style

(F841)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run Tests with Coverage

🔇 Additional comments (5)

translations/management/commands/check_readability.py (5)

25-37: LGTM!

The dataclass is well-structured with appropriate fields for storing readability analysis results.

61-80: Global state in textstat.set_lang() is fine for sequential command execution.

The textstat.set_lang() call modifies global state. This works correctly for a sequential management command, but if this ReadabilityChecker class is ever used in a concurrent context (e.g., async views, threads), the language setting could be overwritten between set_lang() and the metric calls. Consider documenting this limitation or restructuring if concurrent use is planned.

241-281: LGTM!

The white label lookup with fallback from code to name is well-implemented. The translation field list correctly matches the Program model's translation ForeignKey fields based on the relevant code snippets.

314-323: Division by zero is guarded, but consider simplifying the ternary.

The percentage calculation is protected by if total > 0, but the nested ternary structure is slightly hard to follow. This is fine as-is, just noting for future readability.

344-351: Sorting logic for failing translations is correct.

English failures are sorted worst-first (highest grade level), and Spanish failures are sorted worst-first (lowest Fernández-Huerta score). This makes sense for prioritizing which translations need attention first.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

translations/management/commands/check_readability.py (2)
220-226: Consider handling missing language translations gracefully.

set_current_language() may not raise an exception, but accessing translation.text afterward could return None or the fallback language's text depending on Parler's configuration. The current if not text check on line 223 handles None, but you may want to verify that the translation actually exists for the requested language to avoid silently analyzing fallback text.
Optional: Explicit language check
             translation.set_current_language(language)
-            text = translation.text
+            
+            # Check if translation exists for this specific language
+            if not translation.has_translation(language):
+                skipped_count += 1
+                continue
+                
+            text = translation.text
263-287: Consider using select_related to reduce database queries.

The current implementation makes N+1 queries when iterating over programs - one query per translation field access. Using select_related can fetch all translation references in a single query.
Proposed optimization
         # Get all programs for this white label
-        programs = Program.objects.filter(white_label=wl)
+        programs = Program.objects.filter(white_label=wl).select_related(
+            "name",
+            "description",
+            "description_short",
+            "learn_more_link",
+            "apply_button_link",
+            "apply_button_description",
+            "value_type",
+            "estimated_delivery_time",
+            "estimated_application_time",
+            "estimated_value",
+            "website_description",
+        )

📜 Review details

Configuration used: Repository: MyFriendBen/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dea94a1 and a50e17f.

📒 Files selected for processing (1)

translations/management/commands/check_readability.py

🧰 Additional context used

🧬 Code graph analysis (1)

translations/management/commands/check_readability.py (3)

translations/models.py (1)

Translation (167-304)

programs/models.py (1)

Program (597-773)

screener/models.py (1)

WhiteLabel (12-19)

🪛 Ruff (0.14.10)

translations/management/commands/check_readability.py

175-175: Unused method argument: args

(ARG002)

202-202: Avoid specifying long messages outside the exception class

(TRY003)

245-245: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run Tests with Coverage

🔇 Additional comments (4)

translations/management/commands/check_readability.py (4)

1-23: LGTM!

Clear module docstring with usage examples, and appropriate imports for the functionality.

25-37: LGTM!

Clean dataclass definition with appropriate type hints for storing readability analysis results.

148-173: LGTM!

Well-defined CLI arguments with clear help text and sensible defaults.

289-377: LGTM!

Well-structured output formatting with appropriate handling of edge cases (zero total) and correct sorting logic for different language thresholds.

translations/management/commands/check_readability.py

SonaliBedge · 2026-01-13T17:38:07Z

@coderabbitai review

coderabbitai · 2026-01-13T17:38:16Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…ores in JSON /CSV/text.

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@translations/management/commands/check_readability.py`:
- Around line 563-564: The passing-results sort in _print_summary currently
always sorts ascending by r.primary_score; update it to mirror _export_text and
_export_json by sorting with reverse=language.startswith("es") so Spanish
results are shown highest-first; locate the loop that iterates over passing (the
call to self._print_result within _print_summary) and change the sorted(...)
call to use key=lambda r: r.primary_score and reverse=language.startswith("es")
so ordering is consistent with _export_text/_export_json and _print_result.
- Line 344: In _export_report() remove the unused local variable total
(assignment "total = len(passing) + len(failing)") since it is never referenced;
update the function in translations/management/commands/check_readability.py by
deleting that line to eliminate the dead variable and any linter warnings.

🧹 Nitpick comments (2)

translations/management/commands/check_readability.py (2)
126-137: Unsupported languages still fall through to English analysis.

While the past review flagged this and the PR description was updated to remove Vietnamese, the code still uses an else branch that will apply English metrics to any non-Spanish language. If a user accidentally passes an unsupported language like --language vi or --language fr, they'll get misleading English readability scores.

Consider adding input validation in the handle() method to reject unsupported languages early:
SUPPORTED_LANGUAGES = {"en-us", "en", "es"}
if not any(language.startswith(prefix) for prefix in ("en", "es")):
    raise CommandError(f"Unsupported language: {language}. Supported: en-us, es")
245-266: Minor redundancy in word count checking.

The word count is checked twice: once inside checker.check() (line 114) which marks short texts as passing with empty scores, and again here (line 259) to skip them entirely. This works correctly but the logic is slightly scattered.

Consider either:

Having check() return a distinct status for "skipped" vs "passing", or

Checking word count only in the loop before calling check()

This is a minor observation—the current approach is functional.

translations/management/commands/check_readability.py

…y-check

… main into this branch.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

translations/management/commands/check_readability.py (1)

127-137: ⚠️ Potential issue | 🟠 Major

Reject unsupported --language values instead of silently falling back to English.

At Line 132, any non-es language is analyzed with English metrics, while custom threshold wiring at Line 217-Line 218 only applies for en*/es*. A typo like --language en_us or fr yields misleading scoring behavior instead of a clear error.

Suggested fix

@@
-        if lang.startswith("es"):
+        if lang.startswith("es"):
             scores = self.analyze_spanish(text)
             primary_score = scores["fernandez_huerta"]
             passes = primary_score >= self.es_threshold
             threshold = self.es_threshold
-        else:
+        elif lang.startswith("en"):
             scores = self.analyze_english(text)
             primary_score = scores["flesch_kincaid_grade"]
             passes = primary_score <= self.en_threshold
             threshold = self.en_threshold
+        else:
+            raise ValueError(f"Unsupported language for readability analysis: {lang}")
@@
-        language = options["language"]
+        language = options["language"].lower()
+        if not (language.startswith("en") or language.startswith("es")):
+            raise CommandError(
+                f"Unsupported --language '{language}'. Supported prefixes are 'en' and 'es'."
+            )

Also applies to: 155-160, 217-219

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@translations/management/commands/check_readability.py` around lines 127 -
137, The command currently treats any non-"es" language as English; update the
language validation in the check_readability command to explicitly accept only
languages starting with "en" or "es" and reject others with a clear error/exit
instead of falling back to English; locate the language handling where
analyze_spanish and analyze_english are invoked (references: analyze_spanish,
analyze_english, es_threshold, en_threshold) and replace the implicit else with
an explicit check (if lang.startswith("es") ... elif lang.startswith("en") ...
else -> raise CommandError/print error and exit), and apply the same strict
validation logic to the other language-handling blocks noted around the other
occurrences (the blocks referencing primary_score/en_threshold and
fernandez_huerta/es_threshold) so typos like "en_us" or unsupported codes
produce an error rather than silent fallback.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@translations/management/commands/check_readability.py`:
- Around line 225-230: The current white-label logic fetches all active
Translation rows and then filters in Python, causing N+1 FK hits; update the
initial queryset (the variable `translations`) to apply the white-label filter
in the database (e.g., add .filter(program__label__in=allowed_labels) or the
appropriate program field) before prefetch_related so only matching translations
are loaded, and modify `_get_whitelabel_translation_labels` to use
select_related('program') (or include the program join in its queryset) so you
don’t dereference program FKs per-row; ensure you keep
prefetch_related("translations") as needed but move filtering into the ORM to
avoid iteration-based filtering.

---

Duplicate comments:
In `@translations/management/commands/check_readability.py`:
- Around line 127-137: The command currently treats any non-"es" language as
English; update the language validation in the check_readability command to
explicitly accept only languages starting with "en" or "es" and reject others
with a clear error/exit instead of falling back to English; locate the language
handling where analyze_spanish and analyze_english are invoked (references:
analyze_spanish, analyze_english, es_threshold, en_threshold) and replace the
implicit else with an explicit check (if lang.startswith("es") ... elif
lang.startswith("en") ... else -> raise CommandError/print error and exit), and
apply the same strict validation logic to the other language-handling blocks
noted around the other occurrences (the blocks referencing
primary_score/en_threshold and fernandez_huerta/es_threshold) so typos like
"en_us" or unsupported codes produce an error rather than silent fallback.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: MyFriendBen/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d62295b8-f892-4181-9c2b-3c7efc1d6590

📥 Commits

Reviewing files that changed from the base of the PR and between 2254360 and 870897d.

📒 Files selected for processing (2)

requirements.txt
translations/management/commands/check_readability.py

🚧 Files skipped from review as they are similar to previous changes (1)

requirements.txt

translations/management/commands/check_readability.py

kbrey

@SonaliBedge Nice work, here are some suggestions for you, let me know what you think!

kbrey · 2026-03-12T20:24:33Z

README.md

 ```

 For detailed information about writing and maintaining integration tests, see [docs/INTEGRATION_TESTING.md](docs/INTEGRATION_TESTING.md).
+


Thanks for thinking about documentation! I think our README is pretty high level so instead you can add this command to the commands wiki.

kbrey · 2026-03-12T20:26:39Z

.gitignore

 .claude/projects/
 .claude/settings.local.json
+
+# Readability check output files


I would suggest to output these files in a folder, such as readability-output/ (or tmp/), so that they don't collide with potential future files and so that this isn't fragile to file name changes.

translations/management/commands/check_readability.py

…y-check

…ther suggested changes.

kbrey

@SonaliBedge great cleanups! A couple followup suggestions, let me know what you think.

kbrey · 2026-03-17T14:00:11Z

translations/management/commands/check_readability.py

-            self.stdout.write(self.style.SUCCESS("PASSING TRANSLATIONS:"))
-            self.stdout.write("-" * 60)
+        for line in lines:
+            if line.startswith("✅"):


Thanks for making this drier! But this approach seems a bit fragile to have conditionals based on the icons; if we ever remove those icons from the text this would silently break. What would you think about a small ReportData dataclass that captures the computed state (sorted results, counts, threshold, metric name) and a _build_report_data method that populates it. Each output method (_print_summary, _export_text, _export_json, _export_csv) then renders from that directly.

kbrey · 2026-03-17T16:43:17Z

translations/management/commands/check_readability.py

+
+            result = checker.check(translation.label, text, language)
+
+            if result.word_count < min_words:


If I'm understanding right, we check for minimum word count twice here, both in checker.check and laso here within _analyze. Seems like we could clean that up by checking in just one of those places?

…plication

SonaliBedge added 2 commits January 12, 2026 15:56

feat: Add readability check for translations

69385fd

Reformatted

fff14c8

coderabbitai bot reviewed Jan 13, 2026

View reviewed changes

translations/management/commands/check_readability.py Show resolved Hide resolved

Incorrect field names changed

dea94a1

coderabbitai bot reviewed Jan 13, 2026

View reviewed changes

Made small changes suggested my coderabbitai

a50e17f

coderabbitai bot reviewed Jan 13, 2026

View reviewed changes

translations/management/commands/check_readability.py Show resolved Hide resolved

SonaliBedge requested a review from kbrey January 14, 2026 19:01

Added file output with --output and --format options, and rounding sc…

efa1fea

…ores in JSON /CSV/text.

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

translations/management/commands/check_readability.py Outdated Show resolved Hide resolved

translations/management/commands/check_readability.py Outdated Show resolved Hide resolved

SonaliBedge added 2 commits January 20, 2026 16:42

Merge remote-tracking branch 'origin/main' into sonali/mfb-readabilit…

92e50c2

…y-check

Made small changes.

2254360

catonph changed the title ~~Sonali/mfb readability check~~ MFB-62: readability check Feb 17, 2026

SonaliBedge added 2 commits March 5, 2026 08:24

Merge remote-tracking branch 'origin/main' into sonali/mfb-readabilit…

d796839

…y-check

Made changes suggested my Coderabbatai and merged latest changes from…

870897d

… main into this branch.

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

translations/management/commands/check_readability.py Show resolved Hide resolved

Fix readability checker code quality issues flagged in review.

fc6b385

kbrey reviewed Mar 12, 2026

View reviewed changes

SonaliBedge added 4 commits March 12, 2026 17:46

Merge remote-tracking branch 'origin/main' into sonali/mfb-readabilit…

962ac7e

…y-check

Refactor handle() in check_readability into focused helper and made o…

67d440c

…ther suggested changes.

Reformatted check_readability file

72c7c2b

Made small change in comment statement.

2538574

SonaliBedge requested a review from kbrey March 13, 2026 17:44

kbrey reviewed Mar 17, 2026

View reviewed changes

refactor: add ReportData dataclass, extract helpers, fix min_words du…

0c97e95

…plication

SonaliBedge requested a review from kbrey March 17, 2026 18:25

		```

		For detailed information about writing and maintaining integration tests, see [docs/INTEGRATION_TESTING.md](docs/INTEGRATION_TESTING.md).


		result = checker.check(translation.label, text, language)

		if result.word_count < min_words:

Conversation

SonaliBedge commented Jan 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Context & Motivation

Changes Made

Commands:

Check all English translations

Check English for specific white-label (Colorado)

Check Spanish translations

With custom threshold (6th grade level)

Show detailed scores for each translation

CI/CD mode - exit with error code if failures found

Also show passing translations

Command options:

Testing

Install dependencies

Download NLTK data (first time only)

Run readability check

Run unit tests

Deployment

Notes for Reviewers

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

.coderabbit.yaml has unrecognized properties

Walkthrough

Changes

Estimated code review effort

Uh oh!

codecov bot commented Jan 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

SonaliBedge commented Jan 13, 2026

Uh oh!

coderabbitai bot commented Jan 13, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kbrey left a comment

Choose a reason for hiding this comment

Uh oh!

kbrey Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

kbrey Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kbrey left a comment

Choose a reason for hiding this comment

Uh oh!

kbrey Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

kbrey Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

SonaliBedge commented Jan 13, 2026 •

edited

Loading

coderabbitai bot commented Jan 13, 2026 •

edited

Loading

`.coderabbit.yaml` has unrecognized properties

codecov bot commented Jan 13, 2026 •

edited

Loading