Skip to content

fix(bedrock): preserve thinking blocks in multi-turn tool use#1873

Open
mesutoezdil wants to merge 5 commits into
kagent-dev:mainfrom
mesutoezdil:fix/bedrock-thinking-tool-use
Open

fix(bedrock): preserve thinking blocks in multi-turn tool use#1873
mesutoezdil wants to merge 5 commits into
kagent-dev:mainfrom
mesutoezdil:fix/bedrock-thinking-tool-use

Conversation

@mesutoezdil
Copy link
Copy Markdown
Contributor

@mesutoezdil mesutoezdil commented May 15, 2026

Fixes #1870

When extended thinking is active, Bedrock returns thinking content blocks together with tool use blocks.

The API requires these blocks to be sent back unmodified in the next request.

Without them, Bedrock returns a ValidationException with toolUse.input is empty.

See the API requirement: https://docs.aws.amazon.com/bedrock/latest/userguide/claude-messages-extended-thinking.html

Tests added for thinking block round-trip and temperature/top_p exclusion.

cc @marcofranssen @supreme-gg-gg @EItanya

Copilot AI review requested due to automatic review settings May 15, 2026 14:19
@github-actions github-actions Bot added the bug Something isn't working label May 15, 2026
@mesutoezdil
Copy link
Copy Markdown
Contributor Author

@supreme-gg-gg could you test this with extended thinking and tool use? That is the scenario that was broken.

@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels May 15, 2026
@mesutoezdil mesutoezdil force-pushed the fix/bedrock-thinking-tool-use branch from 3eeb362 to 78e2467 Compare May 15, 2026 14:23
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses Amazon Bedrock (Converse API) extended thinking + tool-use multi-turn behavior by preserving Bedrock “reasoning/thinking” content blocks across turns so subsequent requests validate correctly, and it also adds Helm chart support for exposing controller metrics (service, RBAC, and optional ServiceMonitor).

Changes:

  • Bedrock: parse/accumulate reasoning content blocks (non-streaming + streaming) and round-trip them back into subsequent requests; omit temperature/top_p when thinking is enabled.
  • Helm: add controller.metrics.* values and templates for a dedicated metrics Service, optional ServiceMonitor, and RBAC to support secure metrics scraping.
  • Tests: add unit tests for thinking-block round-trip conversion and add Helm unittest coverage for metrics resources.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
helm/kagent/values.yaml Adds configurable controller.metrics settings (port, secure mode, ServiceMonitor knobs).
helm/kagent/templates/controller-deployment.yaml Exposes an optional metrics container port when metrics are enabled.
helm/kagent/templates/controller-configmap.yaml Wires metrics bind address + secure flag into controller config when enabled.
helm/kagent/templates/controller-metrics-service.yaml Adds a dedicated ClusterIP Service for metrics.
helm/kagent/templates/controller-servicemonitor.yaml Adds an optional Prometheus Operator ServiceMonitor for controller metrics.
helm/kagent/templates/rbac/metrics-auth-role.yaml Adds ClusterRole/Binding for delegated auth (TokenReview/SAR) when secure metrics enabled.
helm/kagent/templates/rbac/metrics-reader-role.yaml Adds ClusterRole allowing GET /metrics when metrics enabled.
helm/kagent/tests/controller-deployment_test.yaml Ensures metrics port + configmap env are rendered when enabled and not otherwise.
helm/kagent/tests/controller-service_test.yaml Ensures main controller Service does not gain extra ports when metrics enabled.
helm/kagent/tests/controller-metrics-service_test.yaml Tests rendering and shape of the dedicated metrics Service.
helm/kagent/tests/controller-servicemonitor_test.yaml Tests ServiceMonitor conditional rendering and TLS settings.
helm/kagent/tests/metrics-rbac_test.yaml Tests metrics RBAC templates render conditions and rule contents.
go/adk/pkg/models/bedrock.go Implements reasoning/thinking block parsing (streaming + non-streaming), round-trip conversion, and inference-config adjustments under thinking.
go/adk/pkg/models/bedrock_test.go Adds a thinking-block conversion test and a minimal thinking/temperature test case.
Comments suppressed due to low confidence (1)

go/adk/pkg/models/bedrock.go:383

  • In streaming mode the final Parts ordering does not preserve the original Bedrock content-block order: all thinking parts are prepended, then all text is concatenated, then all tool calls are appended. Bedrock requires reasoning/tool blocks to be sent back in the same order as returned; reordering by type can break multi-turn tool-use validation. Consider accumulating completed blocks keyed by ContentBlockIndex (including text/tool/thinking) and assembling finalParts by increasing index (or by the exact stop/start sequence) instead of grouping by kind.
	// thinking parts first; block order must match what Bedrock returned.
	finalParts := []*genai.Part{}
	finalParts = append(finalParts, completedThinkingParts...)
	text := aggregatedText.String()
	if text != "" {
		finalParts = append(finalParts, &genai.Part{Text: text})
	}
	// Add completed tool calls
	finalParts = append(finalParts, completedToolCalls...)


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread go/adk/pkg/models/bedrock_test.go Outdated
Comment on lines +463 to +477
func TestThinkingDropsTemperatureAndTopP(t *testing.T) {
cases := []struct {
name string
fields map[string]any
wantDrop bool
}{
{"no thinking", nil, false},
{"thinking key present", map[string]any{"thinking": map[string]any{"type": "enabled", "budget_tokens": 2000}}, true},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, thinkingEnabled := tc.fields["thinking"]
if thinkingEnabled != tc.wantDrop {
t.Errorf("thinkingEnabled = %v, want %v", thinkingEnabled, tc.wantDrop)
}
Comment thread helm/kagent/values.yaml Outdated
Comment on lines +225 to +235
metrics:
enabled: false
port: 8443
# -- Serve metrics over HTTPS with authn/authz, matching the controller binary default.
# Set to false for plain HTTP scraping in environments where TLS is handled externally.
secure: true
serviceMonitor:
enabled: false
interval: 30s
scrapeTimeout: 10s
labels: {}
interval: {{ .Values.controller.metrics.serviceMonitor.interval }}
scrapeTimeout: {{ .Values.controller.metrics.serviceMonitor.scrapeTimeout }}
{{- if .Values.controller.metrics.secure }}
scheme: https
@mesutoezdil mesutoezdil force-pushed the fix/bedrock-thinking-tool-use branch 3 times, most recently from 3c9ede6 to 98103aa Compare May 15, 2026 14:34
@marcofranssen
Copy link
Copy Markdown
Contributor

I was doing an atempt here as well, but couldn't get it to work just yet, allthough there might be parts useful for this PR.

#1872

@mesutoezdil mesutoezdil force-pushed the fix/bedrock-thinking-tool-use branch 2 times, most recently from 0077429 to 942405c Compare May 15, 2026 19:34
When extended thinking is active, Bedrock returns thinking content blocks
together with tool use blocks. The API requires these blocks to be sent
back unmodified in the next request. Without them, Bedrock returns a
ValidationException with toolUse.input is empty.

Only emit thinking blocks for the last assistant turn before tool results.
Sending them in all turns causes token counts to compound across long
sessions.

Truncate tool results in historical turns to 2000 chars. Older turns do
not need full fidelity for large kubectl or YAML outputs.

Wrap the OTLP span exporter to cap string attributes at 16KB, preventing
large tool responses from exceeding Tempo's 4MB gRPC message limit.

Fixes kagent-dev#1870

Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
@mesutoezdil mesutoezdil force-pushed the fix/bedrock-thinking-tool-use branch from 942405c to bf0a8e7 Compare May 15, 2026 19:35
@mesutoezdil
Copy link
Copy Markdown
Contributor Author

I was doing an atempt here as well, but couldn't get it to work just yet, allthough there might be parts useful for this PR.

#1872

Thx! I took a look at #1872 and incorporated the token explosion fix from there.

@EItanya
Copy link
Copy Markdown
Contributor

EItanya commented May 18, 2026

I was doing an atempt here as well, but couldn't get it to work just yet, allthough there might be parts useful for this PR.

#1872

@marcofranssen any chance you can give this PR a smoke test before I review/approve?

mesutoezdil and others added 2 commits May 22, 2026 17:18
Extract inference config construction into a testable helper so that
the temperature/top_p exclusion logic for extended thinking can be
exercised directly without mocking the AWS Bedrock client.

Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
@mesutoezdil mesutoezdil force-pushed the fix/bedrock-thinking-tool-use branch from 06a04b0 to 9ef3656 Compare May 22, 2026 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG]: ModelConfig Bedrock Thinking not working.

4 participants