Skip to content

Conversation

@simonpcouch
Copy link
Owner

thinking.mov

Started out with:

I'd like to add support for a thinking mode in this application. It should work like this:

  • Ctrl+T to toggle thinking, with default thinking off. When the setting is toggled, a small piece of text should appear to the lower right side of the chat input box saying "Thinking on" or "Thinking off". The text should be left-aligned, with it's right-hand side near the right edge of the chat input box.
    • "Thinking on" should be green. After a few seconds, it should fade to grey and read "Thinking on (Ctrl+T)"
    • Thinking off should be grey and fade to "Thinking off (Ctrl+T)"
  • When thinking is on, introduce a line prefixing each user message (including tool call results) reading "If you want to stop for a moment and consider your next steps, you can use <think> tags at the beginning of your message. No need to use that tag if you don't want to. Keep it very brief." That line should be "ephemeral" in the sense that it's not actually included in the conversation history—just include it in what's shown to the model but don't include it (nor the thinking output) in what's actually sent to the model from then-on. It might still need to be logged in the conversation history in order to be shown correctly in the UI.
  • In ellmer, disable thinking/reasoning. This should at least work for the big three providers: Anthropic, OpenAI, and Gemini.
  • Process the provided clients to disable native thinking/reasoning inside of ellmer.
  • The thinking output should be streamed into the UI inside of a collapsible as greyed-out, italicized text at .9 size of usual response text. After one line, the remaining text shouldn't be visible in the UI unless the user clicks the dropdown. Otherwise, make the UI sleek and minimal. While the model is still streaming thinking text, make the thinking text line "shimmer" until the think tag is closed.

Please research necessary files and package documentation and make a plan to implement this in plans/. Then, implement the plan.


A few issues:

  • This is Claude Sonnet 4.5. Gemini also comfortable using this format, but OpenAI models (4.1, 5.1, 5.2) seem to want nothing to do with the tags, including with reasoning disabled/enabled.
  • Loading in a saved (interrupted?) chat results in the instruction not being stripped.
  • I think you have to have the chat input selected to toggle to Thinking On/Off, currently.
  • I haven't looked into how the thinking instruction is actually made ephemeral.

Want to try and implement native thinking + ContextText processing and see how different this feels.

@simonpcouch
Copy link
Owner Author

Thinking about an interface to "native" thinking...

  • For thinking on/off for those three providers, the provider/Chat object would be modified in a way to turn
    thinking on or off. Then, the thinking content would be shown to the user in the same way.
  • Each provider would have a toggle_thinking method that would set the arguments accordingly. This would allow users of
    non-big-three labs' models/providers to use the feature by defining their own toggle_thinking method.
  • Remove the ephemeral prompting + method anymore. You can remove that, but preserve the toggle UI and
    streaming UI.
  • There'd need to be some sort of model-level validation. e.g. GPT 4.1 does not support thinking but 5.0+ does.

@simonpcouch
Copy link
Owner Author

2447bc5 transitions from an ad-hoc tag (provider-agnostic) to native reasoning support (only for the big three labs). With the tag, it was difficult to get thinking that felt like it scaled to the scope of the problem; both easy and hard things got a few sentences each. For the native reasoning support, there's much more variation, and it feels better-scoped to the problem at hand.

thinking.mov

Still have an issue where all thinking blocks stream into the
top of the message rather than interleaved throughout, both when streaming and in replay. Instead, we want e.g. if the model chooses to think more than once, we show the thinking block interleaved with the other content exactly where it happened.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants