common: add two-phase graceful reasoning budget termination ... by zeel2104 · Pull Request #21141 · ggml-org/llama.cpp

zeel2104 · 2026-03-29T03:03:37Z

Overview

Implements Option B from issue #20632: a two-phase graceful termination for the reasoning budget sampler, replacing hard truncation with a structured conclusion phase.
Previously, --reasoning-budget-message injected a wrap-up string at exactly token N, leaving the model zero tokens to act on it, which is functionally equivalent to raw truncation. This PR splits the budget into a thinking phase and a conclusion phase:

- At the end of the thinking budget, the message is forced token-by-token (INJECTING state).
- The model is then given --reasoning-budget-conclusion N free tokens to terminate naturally (CONCLUDING state).
- If the model does not produce the end tag within those tokens, the hard-cutoff safety net fires (FORCING state), preserving existing behavior.

New state machine: IDLE → COUNTING → INJECTING → CONCLUDING → FORCING → DONE
Setting --reasoning-budget-conclusion 0 (the default) preserves existing behavior exactly, so the change is fully backward compatible.
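
The state machine described above can be sketched roughly as follows. This is a minimal illustration, not the sampler code from this PR: `budget_sketch`, `forced()`, `accept()`, `drive()` and `NO_FORCE` are hypothetical names, the start and end tags are assumed to be single tokens, the thinking budget is assumed to be at least 1, and the conclusion budget is assumed to be counted separately after the message is injected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in types for illustration; not the llama.cpp sampler API.
using token_t = int32_t;
constexpr token_t NO_FORCE = -1; // returned when the model may sample freely

enum class budget_state { IDLE, COUNTING, INJECTING, CONCLUDING, FORCING, DONE };

struct budget_sketch {
    token_t start_tok;             // assumed single-token start-of-reasoning tag
    token_t end_tok;               // assumed single-token end-of-reasoning tag
    int thinking_budget;           // free tokens in the thinking phase (assumed >= 1)
    int conclusion_budget;         // free tokens after the message (0 = disabled)
    std::vector<token_t> message;  // wrap-up message, forced token by token

    budget_state state = budget_state::IDLE;
    int remaining = 0;
    std::size_t msg_pos = 0;

    // Token that must be emitted next, or NO_FORCE if sampling is unconstrained.
    token_t forced() const {
        if (state == budget_state::INJECTING) {
            assert(msg_pos < message.size());
            return message[msg_pos];
        }
        if (state == budget_state::FORCING) return end_tok;
        return NO_FORCE;
    }

    // Pick the phase that follows the message: conclusion window or hard cutoff.
    budget_state after_message() {
        if (conclusion_budget > 0) {
            remaining = conclusion_budget;
            return budget_state::CONCLUDING;
        }
        return budget_state::FORCING;
    }

    // Advance the state machine with the token that was actually emitted.
    void accept(token_t tok) {
        switch (state) {
            case budget_state::IDLE:
                if (tok == start_tok) { remaining = thinking_budget; state = budget_state::COUNTING; }
                break;
            case budget_state::COUNTING:
                if (tok == end_tok) { state = budget_state::DONE; }
                else if (--remaining <= 0) {
                    msg_pos = 0;
                    state = message.empty() ? after_message() : budget_state::INJECTING;
                }
                break;
            case budget_state::INJECTING:
                if (++msg_pos == message.size()) state = after_message();
                break;
            case budget_state::CONCLUDING:
                if (tok == end_tok) { state = budget_state::DONE; }
                else if (--remaining <= 0) { state = budget_state::FORCING; }
                break;
            case budget_state::FORCING:
                state = budget_state::DONE; // the end tag has just been emitted
                break;
            case budget_state::DONE:
                break;
        }
    }
};

// Replay a scripted "model" through the sketch; forced tokens override the script.
std::vector<token_t> drive(budget_sketch & s, const std::vector<token_t> & script) {
    std::vector<token_t> out;
    std::size_t i = 0;
    while (true) {
        token_t tok = s.forced();
        if (tok == NO_FORCE) {
            if (i == script.size()) break;
            tok = script[i++];
        }
        out.push_back(tok);
        s.accept(tok);
    }
    return out;
}
```

With `budget_sketch s{1, 2, 3, 2, {10, 11}}`, driving the script `{1, 5, 5, 5, 6, 2, 7}` injects `10, 11` after the three thinking tokens, lets the model emit `6`, and reaches DONE when the model itself produces the end tag `2`, so FORCING is never entered. Setting the conclusion budget to 0 in the same sketch goes straight from INJECTING to FORCING.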

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: I used AI for minor assistance. I have reviewed all changes, understand them fully, and take responsibility for the submitted code.

zeel2104 · 2026-03-29T03:09:52Z

I did disclose I used AI for minor assistance. Thank you

pwilkin

Nice extension, but please add tests into test-reasoning-budget.cpp

ggerganov · 2026-03-31T10:55:09Z

    common_reasoning_format reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK;
    int enable_reasoning = -1; // -1 = auto, 0 = disable, 1 = enable
    int reasoning_budget = -1;
+    int reasoning_budget_conclusion = 0; // tokens reserved for conclusion phase (0 = disabled)


This should not be added in common_params. Better to first fix #20429 before merging this.

Add --reasoning-budget-conclusion N flag that splits the reasoning budget into a thinking phase and a conclusion phase:

- At end of thinking budget, inject --reasoning-budget-message and enter INJECTING state (forces message tokens token-by-token)
- After message is injected, enter CONCLUDING state giving the model N free tokens to terminate naturally
- If model does not self-terminate, fall through to FORCING (hard cutoff) as a safety net

New states added to the sampler state machine: IDLE -> COUNTING -> INJECTING -> CONCLUDING -> FORCING -> DONE

Setting --reasoning-budget-conclusion 0 (the default) preserves existing behavior exactly, fully backward compatible.

Add 5 new tests to test-reasoning-budget.cpp covering:

- natural end in conclusion window (no FORCING)
- conclusion budget exhausted, safety net fires
- no message tokens, conclusion budget only
- backward compat with conclusion_budget=0
- multi-token message injection

Implements Option B from issue ggml-org#20632.

zeel2104 · 2026-04-01T04:20:27Z

reasoning_budget_conclusion has been removed from common_params. It now lives only in common_params_sampling, consistent with the direction of #20429. The CLI arg writes directly to params.sampling.reasoning_budget_conclusion only.
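
As a rough illustration of that layout, the sketch below keeps the field only on the nested sampling struct and has the argument handler write to it directly. The struct and function names (`sampling_params_sketch`, `params_sketch`, `parse_conclusion_arg`) are hypothetical stand-ins, not the real `common_params` / `common_params_sampling` definitions or the actual llama.cpp argument parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the sampling parameters (the real struct has many more fields).
struct sampling_params_sketch {
    int reasoning_budget_conclusion = 0; // tokens reserved for conclusion phase (0 = disabled)
};

// Simplified stand-in for the top-level parameters: there is no top-level copy
// of the conclusion budget, only the value nested under `sampling`.
struct params_sketch {
    sampling_params_sketch sampling;
};

// Hypothetical handler: scans for the flag and writes its value straight into
// params.sampling. Returns true if the flag was found.
bool parse_conclusion_arg(const std::vector<std::string> & args, params_sketch & params) {
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == "--reasoning-budget-conclusion") {
            params.sampling.reasoning_budget_conclusion = std::stoi(args[i + 1]);
            return true;
        }
    }
    return false;
}
```

Keeping a single copy of the value means there is nothing to synchronize between the top-level and sampling parameters, which appears to be the concern behind the review comment.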

MarkErik · 2026-04-02T16:32:04Z

I hope that this gets added soon!

I recently updated from 8500 to 8600, which I believe must have included the new reasoning budget, and now many coding tasks in Roo Code just stall during the thinking phase. (Using GLM4.7)

vanmilleru · 2026-04-04T11:31:01Z

Can it also support text-based reasoning effort levels (none, low, medium, high)?
The OpenAI API has this, and many projects such as Open WebUI already use it as the reasoning budget.

pwilkin · 2026-05-01T16:02:34Z

Can you please fix conflicts?

common: add two-phase graceful reasoning budget termination ...

be5cd55

zeel2104 requested a review from a team as a code owner March 29, 2026 03:03

pwilkin requested changes Mar 29, 2026

zeel2104 requested a review from ggerganov as a code owner March 30, 2026 01:42

github-actions Bot added the testing Everything test related label Mar 30, 2026

ggerganov reviewed Mar 31, 2026

zeel2104 force-pushed the feat/reasoning-budget-conclusion branch from deb1c35 to 02d4c32 Compare April 1, 2026 04:19

Frank-Schruefer mentioned this pull request Apr 1, 2026

Feature Request: Line-level repetition detection to break thinking loops #21264

Closed

This was referenced Apr 4, 2026

Request: add a reasoning button to web UI chat form #20001

Open

Feature Request: WebUI add reasoning effort level on a per-message basis #18405

Open

vanmilleru mentioned this pull request Apr 13, 2026

Misc. bug: enable_thinking param cannot turn off thinking for qwen3.5 #20182

Open

BruceJillis mentioned this pull request Apr 24, 2026

common : re-arm reasoning budget after DONE on new <think> #22323

Merged

smartprogrammer93 mentioned this pull request May 1, 2026

Feature Request: Graceful reasoning budget termination. Avoid mid-sentence cutoff. #20632

Open

zeel2104 wants to merge 2 commits into ggml-org:master from zeel2104:feat/reasoning-budget-conclusion

5 participants
