
Commit 1b3785a (parent: 7a7faf8)

0919 change examples for Qwen2.5-Coder

9 files changed (+76 −59 lines)

README.md

Lines changed: 27 additions & 10 deletions

@@ -44,7 +44,16 @@ This update focuses on two main improvements: scaling up the code training data
 > We update both the special tokens and their corresponding token IDs to maintain consistency with Qwen2.5. The new special tokens are as follows:

 ```json
-{'<|fim_prefix|>': 151659, '<|fim_middle|>': 151660, '<|fim_suffix|>': 151661, '<|fim_pad|>': 151662, '<|repo_name|>': 151663, '<|file_sep|>': 151664, '<|im_start|>': 151644, '<|im_end|>': 151645}
+{
+    "<|fim_prefix|>": 151659,
+    "<|fim_middle|>": 151660,
+    "<|fim_suffix|>": 151661,
+    "<|fim_pad|>": 151662,
+    "<|repo_name|>": 151663,
+    "<|file_sep|>": 151664,
+    "<|im_start|>": 151644,
+    "<|im_end|>": 151645
+}
 ```

 | model name | type | length | Download |

@@ -76,9 +85,9 @@ pip install -r requirements.txt
 ## Quick Start

 > [!Important]
-> **Qwen2.5-Coder-xB-Chat** are instruction models for chatting;
+> **Qwen2.5-Coder-\[1.5-7\]B-Instruct** are instruction models for chatting;
 >
-> **Qwen2.5-Coder-xB** is a base model typically used for completion, serving as a better starting point for fine-tuning.
+> **Qwen2.5-Coder-\[1.5-7\]B** is a base model typically used for completion, serving as a better starting point for fine-tuning.
 >
 ### 👉🏻 Chat with Qwen2.5-Coder-7B-Instruct
 You can just write several lines of code with `transformers` to chat with Qwen2.5-Coder-7B-Instruct. Essentially, we build the tokenizer and the model with the `from_pretrained` method, and we use the `generate` method to perform chatting with the help of the chat template provided by the tokenizer. Below is an example of how to chat with Qwen2.5-Coder-7B-Instruct:

@@ -152,7 +161,7 @@ The `max_new_tokens` argument is used to set the maximum length of the response.
 The `input_text` could be any text that you would like the model to continue.

-#### 2.Processing Long Texts
+#### 2. Processing Long Texts

 The current `config.json` is set for a context length of up to 32,768 tokens.
 To handle extensive inputs exceeding 32,768 tokens, we utilize [YaRN](https://arxiv.org/abs/2309.00071), a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts.

@@ -371,18 +380,26 @@ llm = LLM(model="Qwen/Qwen2.5-Coder-7B", tensor_parallel_size=4)


 ## Performance
-see blog <a href="https://qwenlm.github.io/blog/qwen2.5-coder"> 📑 blog</a>.
+See the <a href="https://qwenlm.github.io/blog/qwen2.5-coder">📑 blog</a> for detailed results.
+


 ## Citation
 If you find our work helpful, feel free to cite us.

 ```bibtex
-@article{qwen,
-  title={Qwen Technical Report},
-  author={Jinze Bai and Shuai Bai and Yunfei Chu and Zeyu Cui and Kai Dang and Xiaodong Deng and Yang Fan and Wenbin Ge and Yu Han and Fei Huang and Binyuan Hui and Luo Ji and Mei Li and Junyang Lin and Runji Lin and Dayiheng Liu and Gao Liu and Chengqiang Lu and Keming Lu and Jianxin Ma and Rui Men and Xingzhang Ren and Xuancheng Ren and Chuanqi Tan and Sinan Tan and Jianhong Tu and Peng Wang and Shijie Wang and Wei Wang and Shengguang Wu and Benfeng Xu and Jin Xu and An Yang and Hao Yang and Jian Yang and Shusheng Yang and Yang Yao and Bowen Yu and Hongyi Yuan and Zheng Yuan and Jianwei Zhang and Xingxuan Zhang and Yichang Zhang and Zhenru Zhang and Chang Zhou and Jingren Zhou and Xiaohuan Zhou and Tianhang Zhu},
-  journal={arXiv preprint arXiv:2309.16609},
-  year={2023}
+@misc{qwen2.5,
+    title = {Qwen2.5: A Party of Foundation Models},
+    url = {https://qwenlm.github.io/blog/qwen2.5/},
+    author = {Qwen Team},
+    month = {September},
+    year = {2024}
+}
+@article{qwen2,
+    title={Qwen2 Technical Report},
+    author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
+    journal={arXiv preprint arXiv:2407.10671},
+    year={2024}
 }
 ```
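The reformatted token table above is the key migration detail in this commit: CodeQwen1.5's `<fim_*>` markers become `<|fim_*|>` in Qwen2.5-Coder. A minimal, model-free sketch of assembling a PSM (prefix-suffix-middle) fill-in-the-middle prompt from those token strings — the helper name is my own, not from the repo:

```python
# Special tokens and IDs exactly as listed in the README diff above.
SPECIAL_TOKENS = {
    "<|fim_prefix|>": 151659,
    "<|fim_middle|>": 151660,
    "<|fim_suffix|>": 151661,
    "<|fim_pad|>": 151662,
    "<|repo_name|>": 151663,
    "<|file_sep|>": 151664,
    "<|im_start|>": 151644,
    "<|im_end|>": 151645,
}

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """PSM layout: the model generates the missing span after <|fim_middle|>."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = build_fim_prompt("def add(a, b):\n    return ", "\n")
```

Passing such a string to the tokenizer, as the FIM examples changed below do, relies on these strings being registered as special tokens so they map to the single IDs above.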

examples/Qwen2.5-Coder-Instruct-stream.py

Lines changed: 2 additions & 2 deletions

@@ -5,8 +5,8 @@
 device = "cuda" # the device to load the model onto

 # Now you do not need to add "trust_remote_code=True"
-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B-Chat")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B-Chat", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct", device_map="auto").eval()

 # model = AutoModelForCausalLM.from_pretrained(
 #     "Qwen/CodeQwen1.5-7B-Chat",

examples/Qwen2.5-Coder-Instruct.md

Lines changed: 5 additions & 5 deletions

@@ -1,17 +1,17 @@
-# Use CodeQwen1.5-base-chat By transformers
-The most significant but also the simplest usage of CodeQwen1.5-base-chat is using the `transformers` library. In this document, we show how to chat with CodeQwen1.5-base-chat in either streaming mode or not.
+# Use Qwen2.5-Coder-7B-Instruct By transformers
+The most significant but also the simplest usage of Qwen2.5-Coder-7B-Instruct is using the `transformers` library. In this document, we show how to chat with Qwen2.5-Coder-7B-Instruct in either streaming mode or not.

 ## Basic Usage
-You can just write several lines of code with `transformers` to chat with CodeQwen1.5-7B-Chat. Essentially, we build the tokenizer and the model with the `from_pretrained` method, and we use the `generate` method to perform chatting with the help of the chat template provided by the tokenizer. Below is an example of how to chat with CodeQwen1.5-7B-Chat:
+You can just write several lines of code with `transformers` to chat with Qwen2.5-Coder-7B-Instruct. Essentially, we build the tokenizer and the model with the `from_pretrained` method, and we use the `generate` method to perform chatting with the help of the chat template provided by the tokenizer. Below is an example of how to chat with Qwen2.5-Coder-7B-Instruct:

 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM

 device = "cuda" # the device to load the model onto

 # Now you do not need to add "trust_remote_code=True"
-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B-Chat")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B-Chat", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct", device_map="auto").eval()

 # tokenize the input into tokens
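For context on what the chat template mentioned here does under the hood: Qwen-family instruct models use ChatML-style `<|im_start|>`/`<|im_end|>` markers (the same tokens listed in the README diff). A hand-rolled sketch of that layout — in real use `tokenizer.apply_chat_template` should be preferred, and the exact template may differ from this approximation:

```python
def to_chatml(messages: list) -> str:
    """Approximate ChatML rendering: one <|im_start|>role ... <|im_end|> block
    per message, then an opened assistant turn for the model to complete."""
    rendered = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    return rendered + "<|im_start|>assistant\n"

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a quicksort in Python."},
])
```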

examples/Qwen2.5-Coder-Instruct.py

Lines changed: 2 additions & 2 deletions

@@ -3,8 +3,8 @@
 device = "cuda" # the device to load the model onto

 # Now you do not need to add "trust_remote_code=True"
-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B-Chat")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B-Chat", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct", device_map="auto").eval()

 # tokenize the input into tokens

examples/Qwen2.5-Coder-fim.py

Lines changed: 5 additions & 5 deletions

@@ -2,17 +2,17 @@
 # load model
 device = "cuda" # the device to load the model onto

-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B", device_map="auto").eval()

-input_text = """<fim_prefix>def quicksort(arr):
+input_text = """<|fim_prefix|>def quicksort(arr):
     if len(arr) <= 1:
         return arr
     pivot = arr[len(arr) // 2]
-<fim_suffix>
+<|fim_suffix|>
     middle = [x for x in arr if x == pivot]
     right = [x for x in arr if x > pivot]
-    return quicksort(left) + middle + quicksort(right)<fim_middle>"""
+    return quicksort(left) + middle + quicksort(right)<|fim_middle|>"""

 model_inputs = tokenizer([input_text], return_tensors="pt").to(device)

examples/Qwen2.5-Coder-repolevel-fim.py

Lines changed: 10 additions & 10 deletions

@@ -2,13 +2,13 @@
 device = "cuda" # the device to load the model onto

 # Now you do not need to add "trust_remote_code=True"
-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B", device_map="auto").eval()

 # tokenize the input into tokens
 # set the fim format in the corresponding file you need to infill
-input_text = """<repo_name>library-system
-<file_sep>library.py
+input_text = """<|repo_name|>library-system
+<|file_sep|>library.py
 class Book:
     def __init__(self, title, author, isbn, copies):
         self.title = title

@@ -36,7 +36,7 @@ def find_book(self, isbn):
     def list_books(self):
         return self.books

-<file_sep>student.py
+<|file_sep|>student.py
 class Student:
     def __init__(self, name, id):
         self.name = name

@@ -57,8 +57,8 @@ def return_book(self, book, library):
             return True
         return False

-<file_sep>main.py
-<fim_prefix>from library import Library
+<|file_sep|>main.py
+<|fim_prefix|>from library import Library
 from student import Student

 def main():

@@ -70,7 +70,7 @@ def main():
     # Set up a student
     student = Student("Alice", "S1")

-    # Student borrows a book<fim_suffix>
+    # Student borrows a book<|fim_suffix|>
     if student.borrow_book(book, library):
         print(f"{student.name} borrowed {book.title}")
     else:

@@ -88,7 +88,7 @@ def main():
     print(book)

 if __name__ == "__main__":
-    main()<fim_middle>
+    main()<|fim_middle|>
 """
 model_inputs = tokenizer([input_text], return_tensors="pt").to(device)

@@ -97,7 +97,7 @@ def main():
 # The generated_ids include prompt_ids, so we only need to decode the tokens after prompt_ids.
 output_text = tokenizer.decode(generated_ids[len(model_inputs.input_ids[0]):], skip_special_tokens=True)

-print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<file_sep>')[0]}")
+print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<|file_sep|>')[0]}")

 # the expected output is as follows:
 """

examples/Qwen2.5-Coder-repolevel.py

Lines changed: 7 additions & 7 deletions

@@ -2,12 +2,12 @@
 device = "cuda" # the device to load the model onto

 # Now you do not need to add "trust_remote_code=True"
-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B", device_map="auto").eval()

 # tokenize the input into tokens
-input_text = """<repo_name>library-system
-<file_sep>library.py
+input_text = """<|repo_name|>library-system
+<|file_sep|>library.py
 class Book:
     def __init__(self, title, author, isbn, copies):
         self.title = title

@@ -35,7 +35,7 @@ def find_book(self, isbn):
     def list_books(self):
         return self.books

-<file_sep>student.py
+<|file_sep|>student.py
 class Student:
     def __init__(self, name, id):
         self.name = name

@@ -56,7 +56,7 @@ def return_book(self, book, library):
             return True
         return False

-<file_sep>main.py
+<|file_sep|>main.py
 from library import Library
 from student import Student

@@ -78,7 +78,7 @@ def main():
 # The generated_ids include prompt_ids, so we only need to decode the tokens after prompt_ids.
 output_text = tokenizer.decode(generated_ids[len(model_inputs.input_ids[0]):], skip_special_tokens=True)

-print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<file_sep>')[0]}")
+print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<|file_sep|>')[0]}")

 # the expected output is as follows:
 """

examples/Qwen2.5-Coder.md

Lines changed: 16 additions & 16 deletions

@@ -1,11 +1,11 @@
-# Use CodeQwen1.5-base By transformers
-One of the simple but fundamental ways to try CodeQwen1.5-base is to use the `transformers` library. In this document, we show how to use CodeQwen1.5-base in three common scenarios of code generation.
+# Use Qwen2.5-Coder-7B By transformers
+One of the simple but fundamental ways to try Qwen2.5-Coder-7B is to use the `transformers` library. In this document, we show how to use Qwen2.5-Coder-7B in three common scenarios of code generation.

 ## Basic Usage
 The model completes code snippets according to the given prompts, without any additional formatting, which is usually termed `code completion` in code generation tasks.

-Essentially, we build the tokenizer and the model with the `from_pretrained` method, and we use the `generate` method to perform code completion. Below is an example of how to use CodeQwen1.5-base:
+Essentially, we build the tokenizer and the model with the `from_pretrained` method, and we use the `generate` method to perform code completion. Below is an example of how to use Qwen2.5-Coder-7B:
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM

@@ -53,7 +53,7 @@ input_text = """<|fim_prefix|>def quicksort(arr):
 <|fim_suffix|>
     middle = [x for x in arr if x == pivot]
     right = [x for x in arr if x > pivot]
-    return quicksort(left) + middle + quicksort(right)<fim_middle>"""
+    return quicksort(left) + middle + quicksort(right)<|fim_middle|>"""

 model_inputs = tokenizer([input_text], return_tensors="pt").to(device)

@@ -88,7 +88,7 @@ model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B", device_map
 # tokenize the input into tokens
 input_text = """<repo_name>library-system
-<file_sep>library.py
+<|file_sep|>library.py
 class Book:
     def __init__(self, title, author, isbn, copies):
         self.title = title

@@ -116,7 +116,7 @@ class Library:
     def list_books(self):
         return self.books

-<file_sep>student.py
+<|file_sep|>student.py
 class Student:
     def __init__(self, name, id):
         self.name = name

@@ -137,7 +137,7 @@ class Student:
             return True
         return False

-<file_sep>main.py
+<|file_sep|>main.py
 from library import Library
 from student import Student

@@ -159,7 +159,7 @@ generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=1024, do_s
 # The generated_ids include prompt_ids, so we only need to decode the tokens after prompt_ids.
 output_text = tokenizer.decode(generated_ids[len(model_inputs.input_ids[0]):], skip_special_tokens=True)

-print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<file_sep>')[0]}")
+print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<|file_sep|>')[0]}")

 ```
 The expected output is as follows:

@@ -209,8 +209,8 @@ model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B", device_map
 # tokenize the input into tokens
 # set the fim format in the corresponding file you need to infill
-input_text = """<repo_name>library-system
-<file_sep>library.py
+input_text = """<|repo_name|>library-system
+<|file_sep|>library.py
 class Book:
     def __init__(self, title, author, isbn, copies):
         self.title = title

@@ -238,7 +238,7 @@ class Library:
     def list_books(self):
         return self.books

-<file_sep>student.py
+<|file_sep|>student.py
 class Student:
     def __init__(self, name, id):
         self.name = name

@@ -290,7 +290,7 @@ def main():
     print(book)

 if __name__ == "__main__":
-    main()<fim_middle>
+    main()<|fim_middle|>
 """
 model_inputs = tokenizer([input_text], return_tensors="pt").to(device)

@@ -299,7 +299,7 @@ generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=1024, do_s
 # The generated_ids include prompt_ids, so we only need to decode the tokens after prompt_ids.
 output_text = tokenizer.decode(generated_ids[len(model_inputs.input_ids[0]):], skip_special_tokens=True)

-print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<file_sep>')[0]}")
+print(f"Prompt: \n{input_text}\n\nGenerated text: \n{output_text.split('<|file_sep|>')[0]}")

 # the expected output is as follows:
 """

@@ -308,8 +308,8 @@ Generated text:
 """
 ```

-# Use CodeQwen1.5-base By vllm
-As a family member of Qwen1.5, CodeQwen1.5 is supported by vLLM. A detailed tutorial can be found in the [Qwen tutorial](https://qwen.readthedocs.io/en/latest/deployment/vllm.html).
+# Use Qwen2.5-Coder-7B By vLLM
+As a member of the Qwen2.5 family, Qwen2.5-Coder-7B is supported by vLLM. A detailed tutorial can be found in the [Qwen tutorial](https://qwen.readthedocs.io/en/latest/deployment/vllm.html).
 Here, we only give a simple example of offline batched inference in vLLM.

 ## Offline Batched Inference

@@ -349,7 +349,7 @@ llm = LLM(model="Qwen/Qwen2.5-Coder-7B", tensor_parallel_size=4)
 ## Streaming Mode

-With the help of `TextStreamer`, you can switch generation with CodeQwen to streaming mode. Below we show you an example of how to use it:
+With the help of `TextStreamer`, you can switch generation with Qwen2.5-Coder to streaming mode. Below we show you an example of how to use it:


 ```python

examples/Qwen2.5-Coder.py

Lines changed: 2 additions & 2 deletions

@@ -2,8 +2,8 @@
 device = "cuda" # the device to load the model onto

 # Now you do not need to add "trust_remote_code=True"
-tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B")
-model = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B", device_map="auto").eval()
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B", device_map="auto").eval()


 # tokenize the input into tokens
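A recurring post-processing step across the changed examples is truncating the decoded text at the first `<|file_sep|>`, since a repo-level model may keep generating into a new file. The same split in isolation — the sample output string is invented:

```python
# Pretend this is the decoded model output: the completed file, then a spill-over file.
decoded = "def main():\n    library = Library()\n<|file_sep|>tests.py\nimport unittest"
# Keep only the text before the first file separator, as the examples' print() calls do.
first_file_only = decoded.split("<|file_sep|>")[0]
```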
