Bug: Collision between literal character 'K' and internal "Keep" edit

# Description 

When a source string and a target string are converted into a list of edits, a literal 'K' in the target string can produce a malformed edit string, because 'K' is also the symbol used for the internal "Keep" edit.
This results in incorrect edit strings for `word-edits-append` and `subword-edits-append`.

# Steps to Reproduce

```python
from tokenizer import Tokenizer
from alignment.aligner import word_level_alignment, char_level_alignment
from create_edits import create_edits

tokenizer = Tokenizer("bert-base-uncased")

src_sent = "test"
tgt_sent = "K test"

word_level_align = word_level_alignment(src_sent=src_sent, tgt_sent=tgt_sent)
char_level_align = char_level_alignment(word_level_align)

edits = create_edits(char_level_align, word_level_align, tokenizer)
print(edits['word-edits-append'])
print(edits['subword-edits-append'])
```

The output for both append edits repeats the 'K' inside the brackets instead of emitting it once:

```json
[{
  "subword": "test",
  "raw_subword": "test",
  "edit": "A_[KKKK]KKKK"
}]
[{
  "subword": "test",
  "raw_subword": "test",
  "edit": "A_[KKKK]KKKK"
}]
```

Both outputs contain an `A_[KKKK]KKKK` edit where we would expect an `A_[K]KKKK` edit.
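To make the difference easy to check, here is a minimal sketch that splits an append edit into its bracketed payload and the operations that follow. It assumes the format is `A_[<appended text>]<per-character ops>`, which is only inferred from the output above. `split_append_edit` is a hypothetical helper for illustration and is not part of the repository.

```python
import re
from typing import Tuple

# Assumed format, inferred from the output above: "A_[<appended text>]<ops>".
APPEND_EDIT = re.compile(r"^A_\[(?P<payload>.*)\](?P<ops>.*)$", re.DOTALL)


def split_append_edit(edit: str) -> Tuple[str, str]:
    """Hypothetical helper: split an append edit into (payload, ops)."""
    match = APPEND_EDIT.match(edit)
    if match is None:
        raise ValueError(f"not an append edit: {edit!r}")
    return match.group("payload"), match.group("ops")


expected_payload, expected_ops = split_append_edit("A_[K]KKKK")
actual_payload, actual_ops = split_append_edit("A_[KKKK]KKKK")

# The ops after the bracket agree; only the bracketed payload differs.
print(expected_payload, expected_ops)  # K KKKK
print(actual_payload, actual_ops)      # KKKK KKKK
```

A check like this could serve as a regression test once the fix lands: for `tgt_sent = "K test"`, the payload should be the single appended character rather than four of them.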

# Proposed Solution

It looks like this is coming from the `insert_to_append()` function in `edits/utils.py`. I will open a PR with a fix for this shortly!

