[`CodeLlamaTokenizer`] Nit, update __init__ to make sure the AddedTokens are not normalized because they are special by ArthurZucker · Pull Request #27359 · huggingface/transformers

ArthurZucker · 2023-11-08T07:45:01Z

What does this PR do?

Bridges the gap between the slow and fast versions of the tokenizer, following the updates in #26570 (similar updates were made to Llama).
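For readers unfamiliar with the pattern the title refers to, here is a minimal sketch of the idea, not the code from this PR. It assumes that special tokens passed as plain strings get wrapped as non-normalized, special added tokens at init time. `AddedTokenSketch`, `as_special_token`, `toy_normalize`, and `match_form` are hypothetical stand-ins written for illustration; the real library class and normalizer may behave differently.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AddedTokenSketch:
    """Hypothetical stand-in for an added-token object; illustration only."""
    content: str
    normalized: bool = True
    special: bool = False


def as_special_token(token: Union[str, AddedTokenSketch]) -> AddedTokenSketch:
    """Wrap a plain string as a special, non-normalized token.

    Objects that are already token instances are passed through unchanged,
    so caller-provided settings are respected.
    """
    if isinstance(token, str):
        return AddedTokenSketch(token, normalized=False, special=True)
    return token


def toy_normalize(text: str) -> str:
    """Toy normalizer (assumption): prepend a marker and replace spaces with it."""
    return "\u2581" + text.replace(" ", "\u2581")


def match_form(token: AddedTokenSketch) -> str:
    """Return the string this token would be matched as in the toy setup."""
    return toy_normalize(token.content) if token.normalized else token.content


if __name__ == "__main__":
    bos = as_special_token("<s>")
    print(repr(match_form(bos)))                       # content kept verbatim
    print(repr(match_form(AddedTokenSketch("<s>"))))   # content altered by the toy normalizer
```

In this toy setup, a normalized token's content is rewritten before matching, so it no longer equals the literal string the user typed; marking special tokens as non-normalized keeps them verbatim.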

HuggingFaceDocBuilderDev · 2023-11-08T08:11:47Z

The documentation is not available anymore as the PR was closed or merged.

LysandreJik · 2023-11-09T08:35:12Z

 PRETRAINED_VOCAB_FILES_MAP = {
    "vocab_file": {
        "hf-internal-testing/llama-code-tokenizer": "https://huggingface.co/hf-internal-testing/llama-tokenizer/resolve/main/tokenizer.model",
+        "codellama/CodeLlama-34b-Instruct-hf": "https://huggingface.co/codellama/CodeLlama-34b-Instruct-hf/resolve/main/tokenizer.model",


remove these three lines and we can merge :)

(These are just here for backwards-compatibility)

LysandreJik

great

…ens are not normalized because they are special (huggingface#27359)
* make sure tokens are properly initialized for codellama slow
* add m ore pretrained models
* style
* test more tokenizers checkpoints

make sure tokens are properly initialized for codellama slow

d7f572c

ArthurZucker marked this pull request as ready for review November 9, 2023 08:24

ArthurZucker requested a review from LysandreJik November 9, 2023 08:25

ArthurZucker added 2 commits November 9, 2023 09:26

add m ore pretrained models

5b147d3

style

4d7ae8c

LysandreJik approved these changes Nov 9, 2023

test more tokenizers checkpoints

fea6f59

LysandreJik approved these changes Nov 9, 2023

ArthurZucker merged commit 085ea7e into main Nov 9, 2023

ArthurZucker deleted the nit-codellama branch November 9, 2023 09:15

ArthurZucker merged 4 commits into main from nit-codellama

3 participants
