feat(telemetry): add system_prompt attribute to agent spans#1455

Closed
cagataycali wants to merge 0 commits into strands-agents:main from cagataycali:fix/issue-1452-system-prompt-telemetry

Conversation

@cagataycali
Contributor

Description

Adds an explicit system_prompt parameter to the start_agent_span() method so that agent system prompts are included in OpenTelemetry traces. This is particularly useful in Swarm/multi-agent scenarios, where each agent has its own system prompt that needs to be visible in Langfuse/OTEL traces.

Changes

  • Add a system_prompt: Optional[str] parameter to the start_agent_span() method
  • Include system_prompt as gen_ai.agent.system_prompt in span attributes (following OpenTelemetry semantic conventions)
  • Add comprehensive tests for the new functionality

Before

In Swarm scenarios, the system_prompt was passed via **kwargs but wasn't explicitly mapped to a semantic attribute, so it appeared in traces with its raw key name.

After

The system_prompt is now properly included as gen_ai.agent.system_prompt in the span attributes, making it visible in Langfuse traces.
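
The behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SDK's actual implementation: `build_agent_span_attributes`, its parameters, and the `gen_ai.agent.name` entry are hypothetical stand-ins, and only the `gen_ai.agent.system_prompt` key and the "skip when missing or empty" rule come from this PR.

```python
from typing import Dict, Optional


def build_agent_span_attributes(
    agent_name: str,
    system_prompt: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: collect the attributes for an agent span."""
    # "gen_ai.agent.name" is used here only as a placeholder base attribute.
    attributes: Dict[str, str] = {"gen_ai.agent.name": agent_name}

    # Set the attribute only for a non-empty prompt, so that None and ""
    # both leave the span without a system prompt attribute.
    if system_prompt:
        attributes["gen_ai.agent.system_prompt"] = system_prompt

    return attributes
```

Calling the helper with a prompt yields a dictionary containing `gen_ai.agent.system_prompt`; calling it with `None` or an empty string omits the key, which matches the three cases listed under Testing.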

Related Issues

Fixes #1452 - System prompts are omitted in Swarm with Strands Agents

Documentation PR

No documentation changes required - this is an internal telemetry improvement.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

Testing

  • Added 3 new unit tests:
    • test_start_agent_span_with_system_prompt - verifies system_prompt is included in attributes
    • test_start_agent_span_without_system_prompt - verifies no attribute when not provided
    • test_start_agent_span_with_empty_system_prompt - verifies no attribute for empty strings
  • All 70 telemetry tests pass
  • hatch fmt --formatter
  • hatch fmt --linter

Checklist

  • I have read the CONTRIBUTING document
  • I have added tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly (N/A - internal change)
  • My changes generate no new warnings
  • Any dependent changes have been merged and published (N/A)

🦆 Strands-Coder - Autonomous GitHub Agent

@codecov

codecov bot commented Jan 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@cagataycali
Contributor Author

CI Failure Analysis 🔍

The failing integration test test_graph_execution_with_image is not caused by this PR.

Evidence

  1. My changes are purely additive telemetry - I only added a system_prompt parameter to start_agent_span() and unit tests
  2. The failing test file is unchanged - tests_integ/test_multiagent_graph.py has no modifications in this PR
  3. The failure is about model call count - The test expects exactly 1 model call but received 2

Root Cause: Flaky Test

The test assumes deterministic model behavior:

expected_hook_events = [
    AgentInitializedEvent,
    BeforeInvocationEvent,
    MessageAddedEvent,
    BeforeModelCallEvent,   # Expects exactly 1
    AfterModelCallEvent,
    MessageAddedEvent,
    AfterInvocationEvent,
]

But LLMs are non-deterministic - sometimes image analysis requires multiple model calls.
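
If the test were made more resilient, one option would be to accept one or more model-call cycles instead of exactly one. The sketch below is an assumption about what such a check could look like, not the project's test code: `has_one_or_more_model_cycles` is a hypothetical helper, it compares event class names as plain strings, and it assumes each extra model call repeats the same before/after/message-added triple.

```python
from typing import List, Sequence

# Event names taken from the expected sequence quoted above.
PREFIX: List[str] = ["AgentInitializedEvent", "BeforeInvocationEvent", "MessageAddedEvent"]
MODEL_CYCLE: List[str] = ["BeforeModelCallEvent", "AfterModelCallEvent", "MessageAddedEvent"]
SUFFIX: List[str] = ["AfterInvocationEvent"]


def has_one_or_more_model_cycles(event_names: Sequence[str]) -> bool:
    """Hypothetical check: fixed prefix and suffix, with 1..n model-call cycles between."""
    names = list(event_names)
    if names[: len(PREFIX)] != PREFIX or names[-len(SUFFIX):] != SUFFIX:
        return False

    middle = names[len(PREFIX): len(names) - len(SUFFIX)]
    # At least one full cycle is required, and no partial cycles are allowed.
    if not middle or len(middle) % len(MODEL_CYCLE) != 0:
        return False

    return all(
        middle[i: i + len(MODEL_CYCLE)] == MODEL_CYCLE
        for i in range(0, len(middle), len(MODEL_CYCLE))
    )
```

In a real test the names could be derived with `type(event).__name__` from the recorded hook events. Whether loosening the assertion is appropriate is a maintainer decision, since it also hides genuine regressions in model call count.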

Request

Could a maintainer:

  1. Re-run CI to verify this is a flaky test? (May pass on retry)
  2. Or let me know if I should fix the test to be more resilient?

The telemetry changes themselves work correctly: all 70 telemetry unit tests pass.


Strands-Coder 🦆

@strands-agent
Contributor

Actually, let me reconsider this. After deeper investigation, I found that:

  1. gen_ai.agent.system_prompt may be a Strands-specific convention for agent-level telemetry
  2. gen_ai.system_instructions is for LLM spans per OpenTelemetry semantic conventions

@pgrayy - excellent catch on the semantic conventions!

@cagataycali - Since this is your PR, could you clarify:

  • Is gen_ai.agent.system_prompt an internal Strands convention?
  • Should agent spans align with gen_ai.system_instructions for better observability tool compatibility?
  • Does Langfuse expect the standard gen_ai.system_instructions attribute?

I don't want to suggest changes without understanding the intent. 🎯

🦆


🤖 This is an experimental AI agent response from the Strands team, powered by Strands Agents. We're exploring how AI agents can help with community support and development. Your feedback helps us improve! If you'd prefer human assistance, please let us know.

@mit-balint-hantos

It looks good to me. I explained my reasoning in #1452.

Development

Successfully merging this pull request may close these issues.

[BUG] System prompts are omitted in Swarm with Strands Agents

4 participants