Python: include tool definitions for Foundry agent evals#5974

Merged
alliscode merged 1 commit into microsoft:main from he-yufeng:fix/foundry-evals-tool-definitions-mapping
May 21, 2026

@he-yufeng
Contributor

Summary

  • add a separate Foundry evaluator set for evaluators that accept tool_definitions in data mappings
  • include tool_definitions only when the uploaded eval items actually contain tool definitions
  • cover agent evaluators such as task_adherence, task_completion, intent_resolution, and task_navigation_efficiency
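
The gating condition in the second bullet can be pictured with a minimal sketch. It assumes eval items are plain dictionaries with an optional `tool_definitions` field; the helper name `items_have_tool_definitions` and the sample items are hypothetical and are not taken from `_foundry_evals.py`.

```python
from typing import Any, Iterable, Mapping


def items_have_tool_definitions(items: Iterable[Mapping[str, Any]]) -> bool:
    """Return True if any eval item carries a non-empty ``tool_definitions`` field.

    Hypothetical helper: the real check in the PR may look at a different shape.
    """
    return any(item.get("tool_definitions") for item in items)


# Illustrative items only; the field names are assumptions.
with_tools = [
    {
        "query": "Weather in Oslo?",
        "response": "Sunny.",
        "tool_definitions": [
            {"name": "get_weather", "description": "Look up weather", "parameters": {}}
        ],
    },
]
without_tools = [{"query": "Hi", "response": "Hello!"}]

print(items_have_tool_definitions(with_tools))     # True
print(items_have_tool_definitions(without_tools))  # False
```

A result of this kind would then decide whether the criteria builder is asked to map `tool_definitions` at all.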

To verify

  • uv run pytest packages/foundry/tests/test_foundry_evals.py -q -p no:cacheprovider
  • uv run ruff check packages/foundry/agent_framework_foundry/_foundry_evals.py packages/foundry/tests/test_foundry_evals.py
  • uv run mypy --config-file packages/foundry/pyproject.toml packages/foundry/agent_framework_foundry/_foundry_evals.py
  • git diff --check

Fixes #5918

Copilot AI review requested due to automatic review settings May 20, 2026 15:19
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds conditional inclusion of tool_definitions in Foundry eval criteria data mappings, including support for agent evaluators that accept tool definitions, and updates tests accordingly.

Changes:

  • Introduce _TOOL_DEFINITION_EVALUATORS to broaden which evaluators can receive tool_definitions mappings.
  • Add include_tool_definitions flag to _build_testing_criteria and wire it from dataset evaluation based on tool presence.
  • Expand tests to assert tool_definitions mapping is included when enabled for tool + agent evaluators.
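
As a rough illustration of the changes listed above, the sketch below builds criteria with an opt-in flag. Everything here is an assumption made for illustration: `TOOL_DEFINITION_EVALUATORS` and `build_testing_criteria` are simplified stand-ins for the private `_TOOL_DEFINITION_EVALUATORS` and `_build_testing_criteria`, the set only lists the evaluators named in the PR summary, `coherence` is used as an example of an evaluator outside that set, and the `{{item.…}}` template strings and dictionary shape are guesses at the mapping format rather than the actual payload.

```python
from typing import Any

# Stand-in for the PR's _TOOL_DEFINITION_EVALUATORS; the real set may contain more names.
TOOL_DEFINITION_EVALUATORS = frozenset(
    {
        "task_adherence",
        "task_completion",
        "intent_resolution",
        "task_navigation_efficiency",
    }
)


def build_testing_criteria(
    evaluators: list[str], *, include_tool_definitions: bool = False
) -> list[dict[str, Any]]:
    """Build one criterion per evaluator, mapping tool_definitions only when allowed."""
    criteria: list[dict[str, Any]] = []
    for name in evaluators:
        # Assumed template syntax for referencing fields of an uploaded item.
        mapping = {"query": "{{item.query}}", "response": "{{item.response}}"}
        # Both conditions must hold: the caller saw tools in the data,
        # and this evaluator is known to accept tool definitions.
        if include_tool_definitions and name in TOOL_DEFINITION_EVALUATORS:
            mapping["tool_definitions"] = "{{item.tool_definitions}}"
        criteria.append({"name": name, "data_mapping": mapping})
    return criteria


criteria = build_testing_criteria(
    ["task_adherence", "coherence"], include_tool_definitions=True
)
for criterion in criteria:
    print(criterion["name"], sorted(criterion["data_mapping"]))
```

With the flag left at its default, neither criterion in this sketch would carry a `tool_definitions` mapping, which mirrors the "only when items include tools" behaviour described in the review.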

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/packages/foundry/agent_framework_foundry/_foundry_evals.py | Adds evaluator set + gating flag so tool_definitions is only mapped when items include tools. |
| python/packages/foundry/tests/test_foundry_evals.py | Updates criteria-builder tests to pass include_tool_definitions=True and adds an agent-evaluator assertion. |

Comment thread python/packages/foundry/agent_framework_foundry/_foundry_evals.py
Comment thread python/packages/foundry/tests/test_foundry_evals.py
@moonbox3
Contributor

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/foundry/agent_framework_foundry/_foundry_evals.py | 250 | 4 | 98% | 456, 461, 641, 706 |
| TOTAL | 35212 | 4102 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 6991 | 30 💤 | 0 ❌ | 0 🔥 | 1m 52s ⏱️ |

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: FoundryEvals MissingRequiredDataMapping for tool_definitions on task_adherence and other agent evaluators
