Merged
95 commits
09425e6
Barebone of new integration
antonpirker Jun 2, 2025
03f7e24
Creating some spans
antonpirker Jun 3, 2025
b24fe91
Removed traceprovider. the hooks work great
antonpirker Jun 3, 2025
2558fd1
cleanup
antonpirker Jun 3, 2025
5ce67e7
use scopes, that what they are meant for.
antonpirker Jun 3, 2025
55f3ea8
Cleanup
antonpirker Jun 3, 2025
0385775
Create transaction for runner.run
antonpirker Jun 3, 2025
2f38a88
Enable our RunHooks automatically
antonpirker Jun 3, 2025
4f25c2e
cleanup
antonpirker Jun 3, 2025
22ed21c
Better naming
antonpirker Jun 4, 2025
be37679
organized code
antonpirker Jun 4, 2025
1a72a92
Set some common data
antonpirker Jun 4, 2025
f28fc61
work on tool span
antonpirker Jun 4, 2025
2e24997
organize code
antonpirker Jun 4, 2025
bba3ef2
organize code
antonpirker Jun 4, 2025
cf06427
organize code
antonpirker Jun 4, 2025
d9acb1e
organize code and set operation name
antonpirker Jun 4, 2025
87f6562
set gen_ai.system
antonpirker Jun 4, 2025
c8a89db
ai client spans
antonpirker Jun 5, 2025
1f21a3b
Added token usage to ai client spans
antonpirker Jun 5, 2025
1e24c10
refactoring
antonpirker Jun 5, 2025
a1763ec
refacotring
antonpirker Jun 5, 2025
5d392f6
order
antonpirker Jun 5, 2025
e7710bc
order
antonpirker Jun 5, 2025
990fae0
better ai client spans
antonpirker Jun 6, 2025
1ac4337
cleanup
antonpirker Jun 6, 2025
08c47e5
moving stuff around
antonpirker Jun 6, 2025
6db1476
moving stuff around
antonpirker Jun 6, 2025
5ca575c
some consistency
antonpirker Jun 6, 2025
c6dbe47
Tool input and output
antonpirker Jun 6, 2025
fd028ef
removed debug output
antonpirker Jun 6, 2025
18f3b41
Using deprecated attr names to make ui work
antonpirker Jun 11, 2025
d64a1de
Updated prompt messages format.
antonpirker Jun 12, 2025
8c7a3dc
better input and output for ai client span
antonpirker Jun 12, 2025
c41a63b
renamed some attributes
antonpirker Jun 13, 2025
2e53aa7
Add available tools to client span
antonpirker Jun 13, 2025
5331ace
add it everywhere where an agent is available
antonpirker Jun 13, 2025
cd1f4ee
made tool_calls array of object
antonpirker Jun 16, 2025
c9b06e4
cleanup
antonpirker Jun 16, 2025
0942b56
cleanup
antonpirker Jun 16, 2025
92f015d
cleanup
antonpirker Jun 16, 2025
78064e7
cleanup
antonpirker Jun 16, 2025
e79d3c7
handle pii
antonpirker Jun 16, 2025
cead2b0
Merge branch 'master' into antonpirker/openai-agents-integration
antonpirker Jun 16, 2025
3ff9242
first version of vibe coded test suite
antonpirker Jun 16, 2025
39c124d
move stuff
antonpirker Jun 16, 2025
20c5343
tests
antonpirker Jun 17, 2025
c31d4a4
Better output
antonpirker Jun 17, 2025
5aa3089
a working test
antonpirker Jun 17, 2025
4ef332f
another test
antonpirker Jun 17, 2025
abe5d05
another test
antonpirker Jun 17, 2025
3a2230d
Setting span origin
antonpirker Jun 17, 2025
90d2cb4
Setting span origin
antonpirker Jun 17, 2025
005e6d0
fixed tests
antonpirker Jun 17, 2025
c5dd40b
cleanup
antonpirker Jun 17, 2025
c6fcd9d
disable openai tracing because it emits a warning when no api_key is set
antonpirker Jun 17, 2025
1a1aa26
updated test matrix
antonpirker Jun 17, 2025
e0fdbcf
cleanup
antonpirker Jun 17, 2025
ed4997b
linting
antonpirker Jun 17, 2025
634ddce
linting
antonpirker Jun 17, 2025
68cdbc4
improved check
antonpirker Jun 17, 2025
82e8a02
better span origin
antonpirker Jun 17, 2025
1aecf4f
linting
antonpirker Jun 17, 2025
b53d599
Merge branch 'master' into antonpirker/openai-agents-integration
antonpirker Jun 17, 2025
66a2960
.
antonpirker Jun 17, 2025
69cd9f3
proper classmethod patching
antonpirker Jun 17, 2025
650ed23
Merge branch 'master' into antonpirker/openai-agents-integration
antonpirker Jun 17, 2025
a8c6415
better invocation spans
antonpirker Jun 17, 2025
0f97113
proper invocation spans
antonpirker Jun 17, 2025
99ccbee
cleanup
antonpirker Jun 17, 2025
5c59668
Bring back handoff span
antonpirker Jun 18, 2025
269802b
Bring back handoff span 2
antonpirker Jun 18, 2025
3885fb4
Bring back invoke agent spans based on hooks.
antonpirker Jun 18, 2025
a2b4d81
explanation of defeat
antonpirker Jun 18, 2025
afe1a17
updated test matrix
antonpirker Jun 18, 2025
43fffe7
linting
antonpirker Jun 18, 2025
145653e
Updates for version 0.0.19
antonpirker Jun 18, 2025
19c81f2
Better hooking into agent invokation
antonpirker Jun 23, 2025
8b12801
Typing
antonpirker Jun 23, 2025
d7198e2
Fixed handoff spans
antonpirker Jun 23, 2025
3f48a1c
tests for handoff span
antonpirker Jun 23, 2025
6e7fbdd
Merge branch 'master' into antonpirker/openai-agents-integration
antonpirker Jun 23, 2025
cebe0f4
cleanup
antonpirker Jun 23, 2025
efe9e39
Apply suggestions from code review
antonpirker Jun 24, 2025
ff760d1
more resilient monkey patching
antonpirker Jun 24, 2025
087bbeb
removed unused function
antonpirker Jun 24, 2025
b55c1ce
removed unused code
antonpirker Jun 24, 2025
87adffd
comment
antonpirker Jun 24, 2025
c6e8230
made finishing spans more resilient
antonpirker Jun 24, 2025
f054a8f
better check
antonpirker Jun 24, 2025
b4f164e
Make sure to never send objects, but rather strings
antonpirker Jun 24, 2025
fbd23c2
typing
antonpirker Jun 24, 2025
7b27d22
Merge branch 'master' into antonpirker/openai-agents-integration
antonpirker Jun 24, 2025
c4bfd0b
merge stuff
antonpirker Jun 24, 2025
4e4f39c
update test matrix
antonpirker Jun 24, 2025
Fixed handoff spans
antonpirker committed Jun 23, 2025
commit d7198e2f5bf9332f196b5dffbe3a8c840b814e93
10 changes: 8 additions & 2 deletions sentry_sdk/integrations/openai_agents/patches/agent_run.py
@@ -2,7 +2,7 @@

 from sentry_sdk.integrations import DidNotEnable

-from ..spans import invoke_agent_span, update_invoke_agent_span
+from ..spans import invoke_agent_span, update_invoke_agent_span, handoff_span

 from typing import TYPE_CHECKING

@@ -121,7 +121,13 @@ async def patched_execute_handoffs(
     run_config,
 ):
     # type: (Any, agents.Agent, Any, list[Any], list[Any], Any, list[Any], list[Any], agents.RunContextWrapper, agents.RunConfig) -> Any
-    """Patched execute_handoffs that ends agent span for handoffs"""
+    """Patched execute_handoffs that creates handoff spans and ends agent span for handoffs"""
+
+    # Create Sentry handoff span for the first handoff (agents library only processes the first one)
+    if run_handoffs:
+        first_handoff = run_handoffs[0]
+        handoff_agent_name = first_handoff.handoff.agent_name
+        handoff_span(context_wrapper, agent, handoff_agent_name)

     # Call original method with all parameters
     result = await original_execute_handoffs(
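The change to patched_execute_handoffs follows a wrap-and-delegate pattern: record something about the call, then await the saved original. A minimal, self-contained sketch of that pattern is below. RunImpl, its execute_handoffs signature, and the recorded list are hypothetical stand-ins, not the agents library or Sentry APIs. As in the diff, only the first handoff is recorded.

```python
import asyncio


class RunImpl:
    """Hypothetical stand-in for the class whose method is patched."""

    @classmethod
    async def execute_handoffs(cls, agent_name, run_handoffs):
        if run_handoffs:
            return f"{agent_name} handed off to {run_handoffs[0]}"
        return f"{agent_name} finished"


# Stand-in for emitted spans; the real patch calls handoff_span() instead.
recorded = []


def patch_execute_handoffs():
    # Accessing the classmethod on the class yields a bound method, so the
    # wrapper can call it later without passing cls explicitly.
    original_execute_handoffs = RunImpl.execute_handoffs

    async def patched_execute_handoffs(agent_name, run_handoffs):
        # Record only the first handoff, mirroring the diff above.
        if run_handoffs:
            recorded.append((agent_name, run_handoffs[0]))
        # Delegate to the original with unchanged arguments.
        return await original_execute_handoffs(agent_name, run_handoffs)

    RunImpl.execute_handoffs = patched_execute_handoffs


patch_execute_handoffs()
```

This sketch assumes the method is always called on the class. A call through an instance would pass the instance as the first positional argument, which the wrapper here does not handle.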
4 changes: 2 additions & 2 deletions sentry_sdk/integrations/openai_agents/spans/handoff.py
@@ -9,11 +9,11 @@
 import agents


-def handoff_span(context, from_agent, to_agent):
+def handoff_span(context, from_agent, to_agent_name):
     # type: (agents.RunContextWrapper, agents.Agent, agents.Agent) -> None
     with sentry_sdk.start_span(
         op=OP.GEN_AI_HANDOFF,
-        name=f"handoff from {from_agent.name} to {to_agent.name}",
+        name=f"handoff from {from_agent.name} to {to_agent_name}",
         origin=SPAN_ORIGIN,
     ) as span:
         span.set_data(SPANDATA.GEN_AI_OPERATION_NAME, "handoff")
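After this change the helper receives the target agent's name as a string rather than an Agent object, which matches what the patch reads from first_handoff.handoff.agent_name. The sketch below illustrates the new calling convention. FakeAgent, FakeHandoff, and handoff_span_name are hypothetical stand-ins for illustration only and are not part of sentry_sdk or agents. Note that the type comment in the diff still lists agents.Agent for the third parameter; the sketch assumes str is what is intended.

```python
from dataclasses import dataclass


@dataclass
class FakeAgent:
    """Hypothetical stand-in for an agent object with a name."""

    name: str


@dataclass
class FakeHandoff:
    """Hypothetical stand-in for a handoff that only knows the target's name."""

    agent_name: str


def handoff_span_name(from_agent, to_agent_name):
    # type: (FakeAgent, str) -> str
    # Same f-string shape as the span name in the diff above.
    return f"handoff from {from_agent.name} to {to_agent_name}"


primary = FakeAgent(name="primary_agent")
handoff = FakeHandoff(agent_name="secondary_agent")
span_name = handoff_span_name(primary, handoff.agent_name)
```

With this format the resulting name is "handoff from primary_agent to secondary_agent", so it may be worth checking it against the description string asserted in the test file of this commit.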
223 changes: 223 additions & 0 deletions tests/integrations/openai_agents/test_openai_agents.py
@@ -73,6 +73,78 @@ def test_agent():
     )


+@pytest.fixture
+def handoff_agents():
+    """Create two agents where the first can handoff to the second"""
+
+    secondary_agent = Agent(
+        name="secondary_agent",
+        instructions="You are a specialist assistant.",
+        model="gpt-4o-mini",
+        model_settings=ModelSettings(
+            temperature=0.5,
+            max_tokens=150,
+        ),
+    )
+
+    primary_agent = Agent(
+        name="primary_agent",
+        instructions="You are a primary assistant that can handoff to specialists.",
+        model="gpt-4",
+        handoffs=[secondary_agent],
+        model_settings=ModelSettings(
+            temperature=0.7,
+            max_tokens=100,
+        ),
+    )
+
+    return primary_agent, secondary_agent
+
+
+@pytest.fixture
+def mock_handoff_responses(mock_usage):
+    """Mock responses that simulate a handoff scenario"""
+
+    # First response: primary agent calls handoff tool
+    handoff_response = ModelResponse(
+        output=[
+            ResponseFunctionToolCall(
+                id="call_handoff_123",
+                call_id="call_handoff_123",
+                name="transfer_to_secondary_agent",
+                type="function_call",
+                arguments="{}",
+                function=MagicMock(name="transfer_to_secondary_agent", arguments="{}"),
+            )
+        ],
+        usage=mock_usage,
+        response_id="resp_handoff_123",
+    )
+
+    # Second response: secondary agent produces final output
+    final_response = ModelResponse(
+        output=[
+            ResponseOutputMessage(
+                id="msg_final",
+                type="message",
+                status="completed",
+                content=[
+                    ResponseOutputText(
+                        text="I'm the specialist and I can help with that!",
+                        type="output_text",
+                        annotations=[],
+                    )
+                ],
+                role="assistant",
+            )
+        ],
+        usage=mock_usage,
+        response_id="resp_final_123",
+    )
+
+    return [handoff_response, final_response]
+
+
 @pytest.mark.asyncio
 async def test_agent_invocation_span(
     sentry_init, capture_events, test_agent, mock_model_response
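The mock_handoff_responses fixture returns two responses meant to be served in order, one per model call. One way to serve queued responses from an async method is unittest.mock.AsyncMock with a list as side_effect. The sketch below is illustrative only: run_two_turns and the string placeholders are hypothetical, and the real test uses ModelResponse objects and a patched OpenAIResponsesModel.get_response.

```python
import asyncio
from unittest.mock import AsyncMock


async def run_two_turns(get_response):
    """Hypothetical driver that calls the model once per agent turn."""
    first = await get_response(agent="primary_agent")
    second = await get_response(agent="secondary_agent")
    return first, second


# Each awaited call returns the next item from the side_effect list.
mock_get_response = AsyncMock(side_effect=["handoff_response", "final_response"])

turns = asyncio.run(run_two_turns(mock_get_response))
```

A list-valued side_effect keeps ordering logic out of the test body. A callable side_effect, as used in test_handoff_span, is the alternative when the response has to depend on the arguments of the call.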
@@ -180,6 +252,157 @@ def test_agent_invocation_span_sync(
     assert ai_client_span["data"]["gen_ai.request.top_p"] == 1.0


+@pytest.mark.asyncio
+async def test_handoff_span(
+    sentry_init, capture_events, handoff_agents, mock_handoff_responses
+):
+    """
+    Test that the integration creates spans for agent handoffs.
+    """
+    primary_agent, secondary_agent = handoff_agents
+
+    with patch.dict(os.environ, {"OPENAI_API_KEY": "test-key"}):
+        with patch(
+            "agents.models.openai_responses.OpenAIResponsesModel.get_response"
+        ) as mock_get_response:
+
+            # First, let's see what handoff tools are actually available
+            # by mocking get_response to capture the tools
+            def debug_get_response(*args, **kwargs):
+                tools = kwargs.get("tools", [])
+                handoffs = kwargs.get("handoffs", [])
+                print(
+                    f"Tools available: {[tool.name if hasattr(tool, 'name') else str(tool) for tool in tools]}"
+                )
+                print(
+                    f"Handoffs available: {[h.tool_name if hasattr(h, 'tool_name') else str(h) for h in handoffs]}"
+                )
+
+                # Find the correct handoff tool name
+                handoff_tool_name = None
+                for handoff in handoffs:
+                    if (
+                        hasattr(handoff, "tool_name")
+                        and "secondary_agent" in handoff.tool_name
+                    ):
+                        handoff_tool_name = handoff.tool_name
+                        break
+
+                if not handoff_tool_name:
+                    # Fallback - look for any handoff tool
+                    for handoff in handoffs:
+                        if hasattr(handoff, "tool_name"):
+                            handoff_tool_name = handoff.tool_name
+                            break
+
+                print(f"Using handoff tool name: {handoff_tool_name}")
+
+                # Create corrected handoff response with the right tool name
+                corrected_handoff_response = ModelResponse(
+                    output=[
+                        ResponseFunctionToolCall(
+                            id="call_handoff_123",
+                            call_id="call_handoff_123",
+                            name=handoff_tool_name,  # Use the correct tool name
+                            type="function_call",
+                            arguments="{}",
+                            function=MagicMock(name=handoff_tool_name, arguments="{}"),
+                        )
+                    ],
+                    usage=Usage(
+                        requests=1,
+                        input_tokens=10,
+                        output_tokens=20,
+                        total_tokens=30,
+                        input_tokens_details=MagicMock(cached_tokens=0),
+                        output_tokens_details=MagicMock(reasoning_tokens=5),
+                    ),
+                    response_id="resp_handoff_123",
+                )
+
+                # Return the corrected response for the first call
+                debug_get_response.call_count += 1
+                if debug_get_response.call_count == 1:
+                    return corrected_handoff_response
+                else:
+                    # Second call - return final response from secondary agent
+                    return mock_handoff_responses[1]
+
+            debug_get_response.call_count = 0
+            mock_get_response.side_effect = debug_get_response
+
+            sentry_init(
+                integrations=[OpenAIAgentsIntegration()],
+                traces_sample_rate=1.0,
+            )
+
+            events = capture_events()
+
+            result = await agents.Runner.run(
+                primary_agent,
+                "Please help me with this specialist task",
+                run_config=test_run_config,
+            )
+
+            assert result is not None
+            assert result.final_output == "I'm the specialist and I can help with that!"
+
+    (transaction,) = events
+    spans = transaction["spans"]
+
+    # Remove the debugger import
+    # import ipdb; ipdb.set_trace()
+    print(f"Number of spans: {len(spans)}")
+    for i, span in enumerate(spans):
+        print(f"Span {i}: {span['op']} - {span['description']}")
+
+    # Should have: primary invoke, primary ai_client, handoff, secondary invoke, secondary ai_client
+    assert len(spans) >= 3
+
+    # Find the spans more flexibly
+    handoff_span = None
+    invoke_agent_spans = []
+    ai_client_spans = []
+
+    for span in spans:
+        if span["op"] == "gen_ai.handoff":
+            handoff_span = span
+        elif span["op"] == "gen_ai.invoke_agent":
+            invoke_agent_spans.append(span)
+        elif span["op"] == "gen_ai.chat":
+            ai_client_spans.append(span)
+
+    # Verify transaction
+    assert transaction["transaction"] == "primary_agent workflow"
+    assert transaction["contexts"]["trace"]["origin"] == "auto.ai.openai_agents"
+
+    # Verify handoff span exists and has correct properties
+    assert handoff_span is not None
+    assert handoff_span["description"] == "handoff primary_agent -> secondary_agent"
+    assert handoff_span["data"]["gen_ai.operation.name"] == "handoff"
+    assert handoff_span["data"]["gen_ai.system"] == "openai"
+    assert handoff_span["data"]["gen_ai.agent.from"] == "primary_agent"
+    assert handoff_span["data"]["gen_ai.agent.to"] == "secondary_agent"
+
+    # Verify both agent invoke spans exist
+    assert len(invoke_agent_spans) == 2
+
+    primary_invoke_span = next(
+        (s for s in invoke_agent_spans if "primary_agent" in s["description"]), None
+    )
+    secondary_invoke_span = next(
+        (s for s in invoke_agent_spans if "secondary_agent" in s["description"]), None
+    )
+
+    assert primary_invoke_span is not None
+    assert primary_invoke_span["description"] == "invoke_agent primary_agent"
+    assert primary_invoke_span["data"]["gen_ai.agent.name"] == "primary_agent"
+
+    assert secondary_invoke_span is not None
+    assert secondary_invoke_span["description"] == "invoke_agent secondary_agent"
+    assert secondary_invoke_span["data"]["gen_ai.agent.name"] == "secondary_agent"
+
+
 @pytest.mark.asyncio
 async def test_tool_execution_span(sentry_init, capture_events, test_agent):
     """
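test_handoff_span sorts the captured spans into buckets by their op value before asserting on them. A compact way to express that step is a small grouping helper, sketched below. group_spans_by_op and the sample span dictionaries are hypothetical; only the op strings are taken from the test, and the chat description is made up for illustration.

```python
from collections import defaultdict


def group_spans_by_op(spans):
    """Return a plain dict mapping each op to the spans that carry it."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["op"]].append(span)
    # Convert to a plain dict so missing ops raise KeyError instead of
    # silently creating empty buckets.
    return dict(grouped)


# Sample data shaped like the span dictionaries inspected in the test.
sample_spans = [
    {"op": "gen_ai.invoke_agent", "description": "invoke_agent primary_agent"},
    {"op": "gen_ai.chat", "description": "chat (illustrative)"},
    {"op": "gen_ai.handoff", "description": "handoff from primary_agent to secondary_agent"},
    {"op": "gen_ai.invoke_agent", "description": "invoke_agent secondary_agent"},
]

by_op = group_spans_by_op(sample_spans)
```

Grouping first keeps the assertions independent of span order, which is the same goal as the loop in the test that finds the spans "more flexibly".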