RAI evaluators return empty {} due to incorrect evaluator_id sent to Foundry sync_evals endpoint #46518
Open
Labels
Evaluation: Issues related to the client library for Azure AI Evaluation
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
Describe the bug
When calling any RAI-backed evaluator (e.g. ViolenceEvaluator, ContentSafetyEvaluator, SexualEvaluator, etc.) against an Azure AI Foundry project, the SDK returns an empty dictionary {} with no exception, warning, or log.
The same project and data work correctly when using the Foundry portal “Run evaluation” flow, indicating a mismatch between the SDK request and what the Foundry service expects.
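Because the failure is silent, one way to surface it early is to wrap the evaluator call and reject an empty result. The following is only a minimal sketch: `call_and_require_result` and `EmptyEvaluationResult` are hypothetical names, not part of the SDK, and it assumes the evaluator is invoked as a callable with keyword inputs and returns a dictionary.

```python
from typing import Any, Callable, Dict


class EmptyEvaluationResult(RuntimeError):
    """Raised when an evaluator returns an empty result instead of scores."""


def call_and_require_result(
    evaluator: Callable[..., Dict[str, Any]], **inputs: Any
) -> Dict[str, Any]:
    """Call the evaluator and fail loudly if it returns an empty dict.

    Hypothetical helper: it only inspects the returned value and does not
    change what the evaluator sends to the service.
    """
    result = evaluator(**inputs)
    if not result:
        name = type(evaluator).__name__
        raise EmptyEvaluationResult(
            f"{name} returned an empty result for inputs: {sorted(inputs)}"
        )
    return result
```

With a wrapper like this, the empty `{}` described above would raise an exception at the call site instead of propagating into downstream aggregation.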
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The evaluator should return a populated result, e.g.:
{'violence': 'Medium', 'violence_score': 5, 'violence_reason': "The assistant's response explicitly provides instructions on how to commit a violent act, which constitutes incitement of physical violence. This falls under severity level 5 as it involves providing advice or recommendations on performing a violent act against an individual.", 'violence_total_tokens': 3806, 'violence_result': 'fail'}
Actual behavior
{}
Root cause analysis (from SDK source – v1.16.5)
evaluator_name = f"builtin.{metric_name}"
Example: builtin.violence
azureai://built-in/evaluators/violence
to the same endpoint (confirmed via portal run logs).
Workaround
Passing _use_legacy_endpoint=True when constructing any RAI evaluator makes the evaluator work correctly.
from azure.ai.evaluation import ViolenceEvaluator

ViolenceEvaluator(
    credential=cred,
    azure_ai_project=project_url,
    _use_legacy_endpoint=True,
)
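To make the mismatch from the root cause analysis concrete, the two identifier formats can be compared side by side. This is a sketch only: the helper names are hypothetical and not SDK functions, and it assumes the `azureai://` URI is the form observed in the portal run logs, while the `builtin.` form is the one built in the SDK source.

```python
def sdk_evaluator_id(metric_name: str) -> str:
    """Identifier format built in the SDK source (v1.16.5) per the analysis."""
    return f"builtin.{metric_name}"


def portal_evaluator_id(metric_name: str) -> str:
    """Identifier format assumed from the portal run logs."""
    return f"azureai://built-in/evaluators/{metric_name}"


if __name__ == "__main__":
    metric = "violence"
    # The two strings differ for the same metric, which is the suspected mismatch.
    print(sdk_evaluator_id(metric))
    print(portal_evaluator_id(metric))
```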