support gpt-oss function/reasoning in /v1/chat/completions by irexyc · Pull Request #3962 · InternLM/lmdeploy

irexyc · 2025-09-12T03:21:41Z


mizuikk · 2025-09-15T03:30:35Z

openwebui: reasoning OK.
roocode: reasoning OK, tool_use failed.
claude: reasoning is not displayed, mcp_use failed.
codex: failed, got a 404 with "--oss".

mizuikk · 2025-09-15T03:35:05Z

I get an error when I send lmdeploy logs (and some other content) in the chat.
//send in chat

2025-09-15 11:29:10,947 - lmdeploy - INFO - logger.py:45 - session=88, adapter_name=None, input_tokens=394, gen_config=GenerationConfig(n=1, max_new_tokens=None, do_sample=True, top_p=1.0, top_k=40, min_p=0.0, temperature=0.7, repetition_penalty=1.0, ignore_eos=False, random_seed=733978821877833456, stop_words=None, bad_words=None, stop_token_ids=[200002, 200012, 199999], bad_token_ids=None, min_new_tokens=None, skip_special_tokens=True, spaces_between_special_tokens=True, logprobs=None, response_format=None, logits_processors=None, output_logits=None, output_last_hidden_state=None, with_cache=False, preserve_cache=False, migration_request=None), prompt='<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.\nKnowledge cutoff: 2024-06\nCurrent date: 2025-09-15\n\nReasoning: medium\n\n# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>### Task:\nGenerate 1-3 broad tags categorizing the main themes of the chat history, along with 1-3 more specific subtopic tags.\n\n### Guidelines:\n- Start with high-level domains (e.g. Science, Technology, Philosophy, Arts, Politics, Business, Health, Sports, Entertainment, Education)\n- Consider including relevant subfields/subdomains if they are strongly represented throughout the conversation\n- If content is too short (less than 3 messages) or too diverse, use only ["General"]\n- Use the chat\'s primary language; default to English if multilingual\n- Prioritize accuracy over specificity\n\n### Output:\nJSON format: { "tags": ["tag1", "tag2", "tag3"] }\n\n### Chat History:\n<chat_history>\nUSER: Tell me a random fun fact about the Roman Empire\nASSISTANT: **Fun Fact:**  \nThe ancient Romans had their very own version of fast‑food restaurants called **thermopolium** (plural: _thermopolia_). These were small, street‑side shops that sold ready‑made hot food and drinks—think pizza‑like flatbreads, stews, and wine—to people on the go. 
Archaeologists have uncovered the remains of dozens of these establishments in Pompeii and Herculaneum, complete with tiny counters, built‑in marble bowls, and even vending‑like jars. So, next time you grab a quick bite, remember that you’re following in the footsteps of Roman “thermopolium” diners that were the original fast‑food stops of the ancient world!\n</chat_history><|end|><|start|>assistant', prompt_token_id=[200006, 17360, 200008, 3575, 553, 17554, 162016, 11, 261, 4410, 6439, 2359, 22203, 656, 7788, 17527, 558, 87447, 100594, 25, 220, 1323, 19, 12, 3218, 198, 6576, 3521, 25, 220, 1323, 20, 12, 3114, 12, 1055, 279, 30377, 289, 25, 14093, 279, 2, 13888, 18403, 25, 8450, 11, 49159, 11, 1721, 13, 21030, 2804, 413, 7360, 395, 1753, 3176, 13, 200007, 200006, 1428, 200008, 31639, 10148, 734, 40569, 220, 16, 12, 18, 11245, 16613, 22794, 6396, 290, 2758, 28085, 328, 290, 7999, 5678, 11, 4251, 483, 220, 16, 12, 18, 945, 4857, 1543, 32799, 16613, 364, 31639, 64804, 734, 12, 7972, 483, 1932, 19231, 45513, 350, 68, 1940, 13, 13993, 11, 15101, 11, 77548, 11, 20188, 11, 65572, 11, 8084, 11, 7923, 11, 16950, 11, 31093, 11, 13626, 446, 12, 26632, 3463, 12331, 1543, 19358, 68155, 129051, 538, 1023, 553, 27203, 27328, 7876, 290, 15681, 198, 12, 1843, 3100, 382, 3101, 4022, 350, 2695, 1572, 220, 18, 10854, 8, 503, 3101, 15174, 11, 1199, 1606, 9129, 20051, 14510, 12, 7649, 290, 7999, 885, 9107, 6439, 26, 2787, 316, 7725, 538, 164467, 198, 12, 39936, 90696, 18580, 1072, 128212, 279, 31639, 18315, 734, 8259, 6011, 25, 354, 392, 27989, 1243, 9129, 7394, 16, 672, 392, 7394, 17, 672, 392, 7394, 18, 2601, 758, 31639, 17554, 15946, 734, 27, 13503, 45451, 523, 12440, 25, 38995, 668, 261, 7526, 2827, 2840, 1078, 290, 18753, 37049, 198, 7378, 6258, 4769, 25, 6240, 28128, 55581, 142760, 4066, 976, 21574, 75657, 1458, 1043, 1869, 2316, 3926, 328, 5661, 50005, 18646, 16043, 4358, 6240, 942, 76, 56987, 2883, 410, 350, 157694, 25, 425, 942, 76, 467, 47216, 9, 741, 5006, 1504, 3291, 11, 
12901, 50005, 4341, 22509, 484, 8754, 6977, 50005, 33322, 3648, 4232, 326, 26899, 2322, 49631, 27941, 50005, 9995, 12508, 65, 51226, 11, 2310, 8811, 11, 326, 14967, 121655, 1665, 402, 290, 810, 13, 123351, 34279, 679, 81941, 290, 14777, 328, 42203, 328, 1879, 81048, 306, 109054, 3573, 326, 487, 93179, 1986, 394, 11, 5533, 483, 20049, 62038, 11, 8113, 50005, 258, 60584, 85942, 11, 326, 1952, 138793, 50005, 9995, 95690, 13, 2632, 11, 2613, 1058, 481, 18013, 261, 4853, 34262, 11, 8203, 484, 481, 3462, 3992, 306, 290, 110905, 328, 18753, 966, 942, 76, 56987, 2883, 693, 131645, 484, 1504, 290, 4756, 5661, 50005, 18646, 29924, 328, 290, 21574, 2375, 4175, 808, 13503, 45451, 29, 200007, 200006, 173781]

//openwebui error

'NoneType' object has no attribute 'find'

//error logs

2025-09-15 11:29:27,313 - lmdeploy - INFO - async_engine.py:739 - session=90, history_tokens=0, input_tokens=2292, max_new_tokens=None, seq_start=True, seq_end=True, step=0, prep=True
2025-09-15 11:29:27,313 - lmdeploy - INFO - turbomind.py:697 - [async_stream_infer] session 90 start
[TM][INFO] [SeqMgr][CachePrompt] ID 90, cached blocks 35, tokens 2292
2025-09-15 11:29:27,572 - lmdeploy - ERROR - async_engine.py:657 - [safe_run] exception caught: HarmonyError unexpected tokens remaining in message header: ["prompt_token_id=[200006,", "17360,", "200008,", "3575,", "553,", "17554,", "162016,", "11,", "261,", "4410,", "6439,", "2359,", "22203,", "656,", "7788,", "17527,", "558,", "87447,", "100594,", "25,", "220,", "1323,", "19,", "12,", "3218,", "198,", "6576,", "3521,", "25,", "220,", "1323,", "20,", "12,", "3114,", "12,", "1055,", "279,", "30377,", "289,", "25,", "14093,", "279,", "2,", "13888,", "18403,", "25,", "8450,", "11,", "49159,", "11,", "1721,", "13,", "21030,", "2804,", "413,", "7360,", "395,", "1753,", "3176,", "13,", "200007,", "200006,", "1428,", "200008,", "31639,", "10148,", "734,", "40569,", "220,", "16,", "12,", "18,", "11245,", "16613,", "22794,", "6396,", "290,", "2758,", "28085,", "328,", "290,", "7999,", "5678,", "11,", "4251,", "483,", "220,", "16,", "12,", "18,", "945,", "4857,", "1543,", "32799,", "16613,", "364,", "31639,", "64804,", "734,", "12,", "7972,", "483,", "1932,", "19231,", "45513,", "350,", "68,", "1940,", "13,", "13993,", "11,", "15101,", "11,", "77548,", "11,", "20188,", "11,", "65572,", "11,", "8084,", "11,", "7923,", "11,", "16950,", "11,", "31093,", "11,", "13626,", "446,", "12,", "26632,", "3463,", "12331,", "1543,", "19358,", "68155,", "129051,", "538,", "1023,", "553,", "27203,", "27328,", "7876,", "290,", "15681,", "198,", "12,", "1843,", "3100,", "382,", "3101,", "4022,", "350,", "2695,", "1572,", "220,", "18,", "10854,", "8,", "503,", "3101,", "15174,", "11,", "1199,", "1606,", "9129,", "20051,", "14510,", "12,", "7649,", "290,", "7999,", "885,", "9107,", "6439,", "26,", "2787,", "316,", "7725,", "538,", "164467,", "198,", "12,", "39936,", "90696,", "18580,", "1072,", "128212,", "279,", "31639,", "18315,", "734,", "8259,", "6011,", "25,", "354,", "392,", "27989,", "1243,", "9129,", "7394,", "16,", "672,", "392,", "7394,", "17,", "672,", "392,", "7394,", "18,", 
"2601,", "758,", "31639,", "17554,", "15946,", "734,", "27,", "13503,", "45451,", "523,", "12440,", "25,", "38995,", "668,", "261,", "7526,", "2827,", "2840,", "1078,", "290,", "18753,", "37049,", "198,", "7378,", "6258,", "4769,", "25,", "6240,", "28128,", "55581,", "142760,", "4066,", "976,", "21574,", "75657,", "1458,", "1043,", "1869,", "2316,", "3926,", "328,", "5661,", "50005,", "18646,", "16043,", "4358,", "6240,", "942,", "76,", "56987,", "2883,", "410,", "350,", "157694,", "25,", "425,", "942,", "76,", "467,", "47216,", "9,", "741,", "5006,", "1504,", "3291,", "11,", "12901,", "50005,", "4341,", "22509,", "484,", "8754,", "6977,", "50005,", "33322,", "3648,", "4232,", "326,", "26899,", "2322,", "49631,", "27941,", "50005,", "9995,", "12508,", "65,", "51226,", "11,", "2310,", "8811,", "11,", "326,", "14967,", "121655,", "1665,", "402,", "290,", "810,", "13,", "123351,", "34279,", "679,", "81941,", "290,", "14777,", "328,", "42203,", "328,", "1879,", "81048,", "306,", "109054,", "3573,", "326,", "487,", "93179,", "1986,", "394,", "11,", "5533,", "483,", "20049,", "62038,", "11,", "8113,", "50005,", "258,", "60584,", "85942,", "11,", "326,", "1952,", "138793,", "50005,", "9995,", "95690,", "13,", "2632,", "11,", "2613,", "1058,", "481,", "18013,", "261,", "4853,", "34262,", "11,", "8203,", "484,", "481,", "3462,", "3992,", "306,", "290,", "110905,", "328,", "18753,", "966,", "942,", "76,", "56987,", "2883,", "693,", "131645,", "484,", "1504,", "290,", "4756,", "5661,", "50005,", "18646,", "29924,", "328,", "290,", "21574,", "2375,", "4175,", "808,", "13503,", "45451,", "29,", "200007,", "200006,", "173781]"]
2025-09-15 11:29:27,572 - lmdeploy - INFO - turbomind.py:762 - [async_stream_infer] GeneratorExit
[TM][INFO] [Interrupt] slot 0, request 90, stop 1
[TM][INFO] [SeqMgr][CacheGeneration] ID 90, cached blocks 35, tokens 2299
2025-09-15 11:29:27,574 - lmdeploy - INFO - turbomind.py:774 - [async_stream_infer] session 90 done
2025-09-15 11:29:27,574 - lmdeploy - ERROR - async_engine.py:642 - [model_inst] exception caught: unexpected tokens remaining in message header: ["prompt_token_id=[200006,", "17360,", "200008,", "3575,", "553,", "17554,", "162016,", "11,", "261,", "4410,", "6439,", "2359,", "22203,", "656,", "7788,", "17527,", "558,", "87447,", "100594,", "25,", "220,", "1323,", "19,", "12,", "3218,", "198,", "6576,", "3521,", "25,", "220,", "1323,", "20,", "12,", "3114,", "12,", "1055,", "279,", "30377,", "289,", "25,", "14093,", "279,", "2,", "13888,", "18403,", "25,", "8450,", "11,", "49159,", "11,", "1721,", "13,", "21030,", "2804,", "413,", "7360,", "395,", "1753,", "3176,", "13,", "200007,", "200006,", "1428,", "200008,", "31639,", "10148,", "734,", "40569,", "220,", "16,", "12,", "18,", "11245,", "16613,", "22794,", "6396,", "290,", "2758,", "28085,", "328,", "290,", "7999,", "5678,", "11,", "4251,", "483,", "220,", "16,", "12,", "18,", "945,", "4857,", "1543,", "32799,", "16613,", "364,", "31639,", "64804,", "734,", "12,", "7972,", "483,", "1932,", "19231,", "45513,", "350,", "68,", "1940,", "13,", "13993,", "11,", "15101,", "11,", "77548,", "11,", "20188,", "11,", "65572,", "11,", "8084,", "11,", "7923,", "11,", "16950,", "11,", "31093,", "11,", "13626,", "446,", "12,", "26632,", "3463,", "12331,", "1543,", "19358,", "68155,", "129051,", "538,", "1023,", "553,", "27203,", "27328,", "7876,", "290,", "15681,", "198,", "12,", "1843,", "3100,", "382,", "3101,", "4022,", "350,", "2695,", "1572,", "220,", "18,", "10854,", "8,", "503,", "3101,", "15174,", "11,", "1199,", "1606,", "9129,", "20051,", "14510,", "12,", "7649,", "290,", "7999,", "885,", "9107,", "6439,", "26,", "2787,", "316,", "7725,", "538,", "164467,", "198,", "12,", "39936,", "90696,", "18580,", "1072,", "128212,", "279,", "31639,", "18315,", "734,", "8259,", "6011,", "25,", "354,", "392,", "27989,", "1243,", "9129,", "7394,", "16,", "672,", "392,", "7394,", "17,", "672,", "392,", "7394,", "18,", "2601,", "758,", 
"31639,", "17554,", "15946,", "734,", "27,", "13503,", "45451,", "523,", "12440,", "25,", "38995,", "668,", "261,", "7526,", "2827,", "2840,", "1078,", "290,", "18753,", "37049,", "198,", "7378,", "6258,", "4769,", "25,", "6240,", "28128,", "55581,", "142760,", "4066,", "976,", "21574,", "75657,", "1458,", "1043,", "1869,", "2316,", "3926,", "328,", "5661,", "50005,", "18646,", "16043,", "4358,", "6240,", "942,", "76,", "56987,", "2883,", "410,", "350,", "157694,", "25,", "425,", "942,", "76,", "467,", "47216,", "9,", "741,", "5006,", "1504,", "3291,", "11,", "12901,", "50005,", "4341,", "22509,", "484,", "8754,", "6977,", "50005,", "33322,", "3648,", "4232,", "326,", "26899,", "2322,", "49631,", "27941,", "50005,", "9995,", "12508,", "65,", "51226,", "11,", "2310,", "8811,", "11,", "326,", "14967,", "121655,", "1665,", "402,", "290,", "810,", "13,", "123351,", "34279,", "679,", "81941,", "290,", "14777,", "328,", "42203,", "328,", "1879,", "81048,", "306,", "109054,", "3573,", "326,", "487,", "93179,", "1986,", "394,", "11,", "5533,", "483,", "20049,", "62038,", "11,", "8113,", "50005,", "258,", "60584,", "85942,", "11,", "326,", "1952,", "138793,", "50005,", "9995,", "95690,", "13,", "2632,", "11,", "2613,", "1058,", "481,", "18013,", "261,", "4853,", "34262,", "11,", "8203,", "484,", "481,", "3462,", "3992,", "306,", "290,", "110905,", "328,", "18753,", "966,", "942,", "76,", "56987,", "2883,", "693,", "131645,", "484,", "1504,", "290,", "4756,", "5661,", "50005,", "18646,", "29924,", "328,", "290,", "21574,", "2375,", "4175,", "808,", "13503,", "45451,", "29,", "200007,", "200006,", "173781]"]

irexyc · 2025-09-15T08:14:31Z

@mizuikk Thanks for your feedback.

I get an error when I send lmdeploy logs (and some other content) in the chat.

This should be fixed.

roocode: reasoning OK, tool_use failed.
claude: reasoning is not displayed, mcp_use failed.
codex: failed, got a 404 with "--oss".

Could you provide more detailed information, such as the inputs?

lvhan028 · 2025-09-15T11:22:13Z

which parser should be specified?

mizuikk · 2025-09-16T01:44:55Z

@mizuikk Thanks for your feedback.

I get an error when I send lmdeploy logs (and some other content) in the chat.

This should be fixed.

roocode: reasoning OK, tool_use failed.
claude: reasoning is not displayed, mcp_use failed.
codex: failed, got a 404 with "--oss".

Could you provide more detailed information, such as the inputs?

CMD:

lmdeploy serve api_server --log-level INFO --tp 2 --enable-prefix-caching --model-name ld/gpt-oss-20b /gpt-oss-20b

openwebui:
The 'NoneType' object has no attribute 'find' error still occurs occasionally in OpenWebUI, but I can't reproduce it.

Roo Code (cannot use any tools):

Claude (in Claude, the model seems unable to understand human instructions):

Codex CLI seems to be just an official compatibility issue.

mizuikk · 2025-09-16T02:11:40Z

I set reasoning_effort to "high" in the model settings of OpenWebUI, but it doesn't seem to be working.
This model now seems much dumber than when using the --chat-template-kwargs "{\"reasoning_effort\": \"high\"}" setting in llama.cpp.
//log

2025-09-16 10:06:10,754 - lmdeploy - INFO - logger.py:45 - session=1, adapter_name=None, input_tokens=92, gen_config=GenerationConfig(n=1, max_new_tokens=None, do_sample=True, top_p=1.0, top_k=40, min_p=0.0, temperature=0.7, repetition_penalty=1.0, ignore_eos=False, random_seed=1757977006890935674, stop_words=None, bad_words=None, stop_token_ids=[200002, 200012, 199999], bad_token_ids=None, min_new_tokens=None, skip_special_tokens=True, spaces_between_special_tokens=True, logprobs=None, response_format=None, logits_processors=None, output_logits=None, output_last_hidden_state=None, with_cache=False, preserve_cache=False, migration_request=None), prompt="<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.\nKnowledge cutoff: 2024-06\nCurrent date: 2025-09-16\n\nReasoning: high\n\n# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>Help me study vocabulary: write a sentence for me to fill in the blank, and I'll try to pick the correct option.<|end|><|start|>assistant", prompt_token_id=[200006, 17360, 200008, 3575, 553, 17554, 162016, 11, 261, 4410, 6439, 2359, 22203, 656, 7788, 17527, 558, 87447, 100594, 25, 220, 1323, 19, 12, 3218, 198, 6576, 3521, 25, 220, 1323, 20, 12, 3114, 12, 1125, 279, 30377, 289, 25, 1932, 279, 2, 13888, 18403, 25, 8450, 11, 49159, 11, 1721, 13, 21030, 2804, 413, 7360, 395, 1753, 3176, 13, 200007, 200006, 1428, 200008, 14470, 668, 5012, 50039, 25, 5067, 261, 21872, 395, 668, 316, 6954, 306, 290, 16953, 11, 326, 17291, 2075, 316, 5230, 290, 6145, 5317, 13, 200007, 200006, 173781]
2025-09-16 10:06:10,754 - lmdeploy - INFO - async_engine.py:739 - session=1, history_tokens=0, input_tokens=92, max_new_tokens=None, seq_start=True, seq_end=True, step=0, prep=True
2025-09-16 10:06:10,755 - lmdeploy - INFO - turbomind.py:697 - [async_stream_infer] session 1 start

irexyc · 2025-09-16T08:58:38Z

Due to network issues, I can't log in to Codex. I tried the Roo-Code plugin and found that when it calls the inference engine (lmdeploy/vllm), it doesn't pass the tools parameter. Instead, it instructs the LLM to output the result with a specific tag. If that tag is missing, Roo-Code assumes there is an issue with the result (displaying 'Roo is having trouble').

I tested vLLM and found its success rate to be higher. Comparing the inputs, I found two differences. First, according to this document, the system message should be treated as a developer message in gpt-oss. vLLM doesn't handle this correctly, while LMDeploy does. Second, for inputs without tools, vLLM removes the commentary channel. After removing the commentary channel in the same way, LMDeploy showed a significant improvement in success rate. (I'm not sure whether this would also help with OpenWebUI.)
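To make the two differences concrete, here is a minimal sketch of the prompt rendering being discussed. It is plain string formatting modeled on the prompts visible in the logs in this thread, not the code used by LMDeploy or vLLM. The function names, the `has_tools` flag, and the `# Instructions` heading for the developer message are assumptions made for illustration.

```python
from datetime import date


def render_system_block(reasoning_effort="medium", has_tools=False, today=None):
    """Render the fixed system block, as seen in the logged prompts."""
    today = today or date.today().isoformat()
    # Assumption from the discussion: without tools, drop the commentary channel.
    channels = ["analysis", "commentary", "final"] if has_tools else ["analysis", "final"]
    return (
        "<|start|>system<|message|>"
        "You are ChatGPT, a large language model trained by OpenAI.\n"
        "Knowledge cutoff: 2024-06\n"
        f"Current date: {today}\n\n"
        f"Reasoning: {reasoning_effort}\n\n"
        f"# Valid channels: {', '.join(channels)}. "
        "Channel must be included for every message.<|end|>"
    )


def render_prompt(messages, reasoning_effort="medium", has_tools=False, today=None):
    """Render OpenAI-style chat messages into a single prompt string."""
    parts = [render_system_block(reasoning_effort, has_tools, today)]
    for message in messages:
        if message["role"] == "system":
            # The API-level system message is rendered as a developer message.
            # The "# Instructions" heading is an assumption for this sketch.
            parts.append(
                "<|start|>developer<|message|># Instructions\n\n"
                f"{message['content']}<|end|>"
            )
        else:
            parts.append(f"<|start|>{message['role']}<|message|>{message['content']}<|end|>")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|start|>assistant")
    return "".join(parts)
```

Called with `has_tools=True`, this produces the same system block that appears in the logged prompts; with the default `has_tools=False`, the valid-channels line lists only `analysis` and `final`.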

This model now seems much dumber than when using the --chat-template-kwargs "{\"reasoning_effort\": \"high\"}" setting in llama.cpp.

I am not sure whether it is related to computational precision in rmsnorm. I will test it soon.

mizuikk · 2025-09-16T09:02:57Z

I am not sure whether it is related to computational precision in rmsnorm. I will test it soon.

Is there a way to set the reasoning_effort parameter when starting the api_server?

 lmdeploy serve api_server --reasoning_effort high

irexyc · 2025-09-16T09:35:04Z

Is there a way to set the reasoning_effort parameter when starting the api_server?

No, you can only set it in the request. According to your log, you did set it: you can find "Reasoning: high" in the prompt.
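For reference, a per-request setting could look like the sketch below. It only builds the HTTP request and does not send it. The top-level `reasoning_effort` field follows the name used in this thread and in the OpenAI chat completions API; whether the server reads it under exactly that name, and the host and port, are assumptions.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_text, reasoning_effort="high"):
    """Build (but do not send) a /v1/chat/completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        # Assumed field name, taken from the discussion in this thread.
        "reasoning_effort": reasoning_effort,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Host and port are placeholders; the model name matches --model-name above.
request = build_chat_request(
    "http://localhost:23333",
    "ld/gpt-oss-20b",
    "Help me study vocabulary.",
)
# To send it against a running server: urllib.request.urlopen(request)
```

If the setting takes effect, the logged prompt should contain the matching "Reasoning: ..." line.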

I ran your query ("Help me study vocabulary: write a sentence for me to fill in the blank, and I'll try to pick the correct option.") locally, but the response fell into repetition. We encountered this on some test datasets and suspect it is caused by rmsnorm, but we have not tested that yet.

The user wants help studying vocabulary. They want a sentence with a blank, and they will try to pick the correct option. So we need to provide a fill-in-the-blank sentence with multiple choice options. The user will then choose the correct one. We should probably provide a few such sentences, maybe 5-10, each with a blank and multiple choice options. The user can then answer. We can also provide the correct answer after they respond, or we can give them the answer after they answer. But the user hasn't responded yet. So we should provide the sentences and options. We can also ask them to respond with their answer. We can also provide a short explanation after they answer. But the user hasn't answered yet. So we should just provide the sentences and options. We can also ask them to pick the correct option for each sentence. We can also provide a short explanation after each answer. But we can also ask them to respond with their answer. The user might want to test themselves. So we can provide a set of sentences with blanks and options. Then we can ask them to fill in the blanks. We can also provide the correct answer after they respond. But we can also provide the correct answer immediately. But the user might want to test themselves. So we can provide the sentences and options, and ask them to respond with their answer. Then we can provide the correct answer. But we can also provide the correct answer after each sentence. But that might reduce the challenge. So we can provide the sentences and options, and ask them to respond with their answer. Then we can provide the correct answer. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. 
But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. 
But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they respond. But we can also provide the correct answer after each sentence. But we can also provide the correct answer after they

lmdeploy/serve/openai/harmony_utils.py

lmdeploy/serve/openai/api_server.py

RunningLeon

LGTM

radna0 · 2025-12-03T05:04:11Z

Does the Offline Inference Pipeline not support this at all? @RunningLeon @irexyc

irexyc · 2025-12-03T07:51:54Z

@radna0 The tool parser and reasoning parser are only supported by the serving API.

irexyc added 9 commits September 8, 2025 08:50

support gpt-oss final output

6433120

support reasoning_effort

0d1a522

output reasoning content

e38e7dc

fix reasoning_effort

b8336dd

update

4edab33

fix ut

6d8921b

support gpt-oss function/reasoning in /v1/chat/completions

d85d5d2

Merge remote-tracking branch 'origin/main' into gpt-oss-io

7b8f3ec

fix lint

56a40d6

lvhan028 self-requested a review September 12, 2025 10:03

lvhan028 added the enhancement New feature or request label Sep 12, 2025

irexyc mentioned this pull request Sep 12, 2025

[Bug] Error when requesting json output with GPT-OSS #3964

Closed

3 tasks

skip process prompt tokens

1130167

irexyc added 2 commits September 16, 2025 08:04

remove commentary channel when no tools are provided

8f4ceb3

Merge remote-tracking branch 'origin/main' into gpt-oss-io

a3fd2e3

update

01c37e2

lvhan028 requested review from CUHKSZzxy and RunningLeon September 17, 2025 07:41

lvhan028 approved these changes Sep 17, 2025

RunningLeon reviewed Sep 18, 2025

lmdeploy/serve/openai/harmony_utils.py

RunningLeon reviewed Sep 18, 2025

lmdeploy/serve/openai/api_server.py

reduce warning

fd896b2

RunningLeon approved these changes Sep 18, 2025

lvhan028 merged commit e7cbc54 into InternLM:main Sep 18, 2025
5 checks passed

