Copilot AI commented Jan 6, 2026

What does this change

Fixes Thai final consonant classification in KhaveeVerifier.check_marttra() for poetry analysis. Ten test cases involving ำ, การันต์ (์), consonant clusters, and ไ/ใ+ย/ล patterns now return correct มาตรา (marttra) values.

What was wrong

  1. การันต์ (์) handling: handle_karun_sound_silence() removed the wrong number of characters because its complex logic miscalculated cutoff positions
  2. ำ finals: Treated as an open syllable (กา) instead of an m-final (กม)
  3. Consonant clusters: Words like จักร (กร cluster) were classified by ร alone, without recognizing the cluster as a k-final
  4. ไ/ใ + ย/ล ambiguity: No distinction between ย/ล as vowel component (ไก่ → กา) vs. true final consonant (ไกล → เกย)
  5. Typo: ", ณ" in consonant list

How this fixes it

Simplified การันต์ handling: Remove ์ and preceding character uniformly

# Before: Complex logic with consonant checks
# After:
word = word[:-2]  # รมย์ → รม, โยชน์ → โยช
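
The uniform rule above can be tried in isolation. The following is a minimal sketch, not the library's actual handle_karun_sound_silence(): the helper name strip_karan is hypothetical, and it assumes the ์ mark is the last character of the word.

```python
THANTHAKHAT = "\u0e4c"  # ์, the mark that silences the preceding consonant


def strip_karan(word: str) -> str:
    """Drop a trailing ์ together with the single character before it.

    Hypothetical helper for illustration only; it mirrors `word[:-2]`
    but leaves words without a trailing ์ untouched.
    """
    if len(word) >= 2 and word.endswith(THANTHAKHAT):
        return word[:-2]
    return word


for sample in ["รมย์", "โยชน์", "ยนต์", "จักร"]:
    print(sample, "->", strip_karan(sample))
```

Because exactly one character is removed before ์, a word in which ์ silences two consonants (for example จันทร์) still keeps the first of them in this sketch.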

Added special cases:

  • Check ำ first → return กม immediately
  • Detect กร/ขร/คร/ฆร clusters in 3+ char words → remove ร before classification
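
The two special cases can be sketched as a small pre-classification step. This is an illustration under stated assumptions, not the code merged in this PR: the function name preclassify and its return shape are hypothetical, and the cluster is assumed to be detected only at the end of the word.

```python
from typing import Optional, Tuple

SARA_AM = "\u0e33"  # ำ
K_CLUSTERS = ("กร", "ขร", "คร", "ฆร")  # k-sound consonant followed by ร


def preclassify(word: str) -> Tuple[Optional[str], str]:
    """Return (marttra, word) after applying the two special cases.

    - A word ending in ำ is an m-final, so "กม" is returned directly.
    - A word of 3+ characters ending in a k+ร cluster has the ร removed,
      so that later classification sees the k consonant as the final.
    A marttra of None means "not decided yet; classify the returned word".
    """
    if word.endswith(SARA_AM):
        return "กม", word
    if len(word) >= 3 and word.endswith(K_CLUSTERS):
        return None, word[:-1]
    return None, word


print(preclassify("ขำ"))
print(preclassify("จักร"))
```

Checking ำ first keeps the later steps from ever seeing it as a plain vowel, which is what caused the กา result described above.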

ไ/ใ + ย/ล disambiguation: Count consonants to determine if ย/ล is true final

def _has_true_final_yl(word):
    consonant_count = sum(1 for c in word if c in thai_consonants)
    return consonant_count >= 2 and word[-1] in ["ย", "ล"]

# ไก่ (1 consonant) → ย is vowel component → กา
# ไกล (2 consonants) → ล is true final → เกย
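
To experiment with the heuristic without installing the library, a self-contained variant might look like the sketch below. The consonant string is rebuilt here from Unicode code points instead of importing thai_consonants, and strip_tones is a hypothetical helper: it assumes tone marks are removed before the check, since a word such as ใกล้ otherwise ends in a tone mark and not in ล.

```python
# The 44 Thai consonants: U+0E01..U+0E2E without ฤ (U+0E24) and ฦ (U+0E26).
THAI_CONSONANTS = "".join(
    chr(cp) for cp in range(0x0E01, 0x0E2F) if cp not in (0x0E24, 0x0E26)
)
TONE_MARKS = "\u0e48\u0e49\u0e4a\u0e4b"  # ่ ้ ๊ ๋


def strip_tones(word: str) -> str:
    """Remove tone marks (hypothetical preprocessing step)."""
    return "".join(c for c in word if c not in TONE_MARKS)


def has_true_final_yl(word: str) -> bool:
    """True if the word has 2+ consonants and ends in ย or ล."""
    consonant_count = sum(1 for c in word if c in THAI_CONSONANTS)
    return consonant_count >= 2 and word.endswith(("ย", "ล"))


for sample in ["ไก่", "ไกล", "ไทย", "ใกล้"]:
    print(sample, has_true_final_yl(strip_tones(sample)))
```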

Code quality: Use imported thai_consonants constant, fix typo, remove redundant ล from กน list

Your checklist for this pull request

  • Passed code styles and structures
  • Passed code linting checks and unit test
Original prompt

This section details the original issue you should resolve

<issue_title>bug: check_marttra("จักร") returns "กน"</issue_title>
<issue_description>### Description

The check_marttra() method in pythainlp.khavee.KhaveeVerifier does not properly handle "กร" as a final consonant.

Expected results

check_marttra("จักร") returns "กก"

Current results

check_marttra("จักร") returns "กน"

See a run in PR #967 at https://github.com/PyThaiNLP/pythainlp/actions/runs/11641064036/job/32419112079?pr=967

Steps to reproduce

from pythainlp.khavee import KhaveeVerifier

kv = KhaveeVerifier()

kv.check_marttra("จักร")

PyThaiNLP version

5.0.4

More cases

All of the following test cases in test_check_marttra() fail:

        self.assertEqual(kv.check_marttra("ขำ"), "กม")
        self.assertEqual(kv.check_marttra("รมย์"), "กม")

        self.assertEqual(kv.check_marttra("ทัย"), "เกย")
        self.assertEqual(kv.check_marttra("ไทย"), "เกย")
        self.assertEqual(kv.check_marttra("ไกล"), "เกย")
        self.assertEqual(kv.check_marttra("ใกล้"), "เกย")

        self.assertEqual(kv.check_marttra("จักร"), "กก")

        self.assertEqual(kv.check_marttra("โยชน์"), "กด")

        self.assertEqual(kv.check_marttra("ยนต์"), "กน")
        self.assertEqual(kv.check_marttra("กรรณ"), "กน")
</issue_description>

## Comments on the Issue (you are @copilot in this section)

<comments>
<comment_new><author>@wannaphong</author><body>
@kangkengkhadev @S2P2 Can you help with this issue?</body></comment_new>
</comments>

Copilot AI and others added 3 commits January 6, 2026 10:48
- Fixed handle_karun_sound_silence to correctly remove only silent consonants marked by ์
- Fixed handling of ำ at the end to return กม (not กา)
- Fixed handling of consonant clusters with ร (e.g., กร, ขร, คร) in compound words
- Fixed handling of words with ไ/ใ vowels to distinguish true final consonants ย/ล from vowel components
- Fixed typo in consonant list (", ณ" -> "ณ")
- Uncommented all previously failing test cases

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>
- Remove duplicate ล from กน consonant list (already classified as เกย on line 282)
- Use imported thai_consonants constant instead of duplicating the string
- Improves code maintainability and eliminates redundancy

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>
- Moved has_true_final_yl from inline function to class method _has_true_final_yl
- Avoids recreating function object on every check_marttra call
- Improves performance and code organization

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>
sonarqubecloud bot commented Jan 6, 2026

Copilot AI changed the title from "[WIP] Fix check_marttra method to correctly handle final consonant" to "Fix check_marttra() Thai final consonant classification" on Jan 6, 2026
Copilot AI requested a review from wannaphong January 6, 2026 10:59
@coveralls
Coverage Status

coverage: 52.597% (+0.1%) from 52.461%
when pulling 4019183 on copilot/fix-check-marttra-bug
into de6be60 on dev.

@wannaphong wannaphong marked this pull request as ready for review January 6, 2026 11:31
@wannaphong wannaphong merged commit 41ed2c8 into dev Jan 6, 2026
43 of 44 checks passed
@wannaphong wannaphong added this to the 5.3 milestone Jan 6, 2026
Development

Successfully merging this pull request may close these issues.

bug: check_marttra("จักร") returns "กน"

3 participants