fix(eval): emit string justification for coded evaluators by ameyjain · Pull Request #1709 · UiPath/uipath-python

ameyjain · 2026-06-10T21:21:28Z

Problem

Coded evaluators (tool-call count/args/order/output, json-similarity) returned a null justification in their score output, while the LLM-judge evaluators returned a populated one.

Root cause: each coded evaluator stored its explanation under a per-evaluator key (explained_tool_calls_count, explained_tool_calls_args, explained_tool_calls_outputs, lcs, matched_leaves/total_leaves) and never under a justification key. The downstream eval worker reads details["justification"], which therefore always resolved to null — even though the structured detail was present.
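The failure mode can be reproduced in a few lines. This is a minimal sketch, not the worker's actual code: the `details` payload shape and the `read_justification` name are assumptions made for illustration; only the `justification` key lookup and the `explained_tool_calls_count` key come from the description above.

```python
from typing import Any, Optional

# Hypothetical dumped details from a coded evaluator before the fix.
# The per-tool message format is invented for illustration.
details: dict[str, Any] = {
    "explained_tool_calls_count": {"search": "expected 2, got 1"},
}


def read_justification(d: dict[str, Any]) -> Optional[str]:
    # Stand-in for the worker-side lookup: only the "justification" key is read.
    return d.get("justification")


# The structured explanation exists, but the lookup still yields None (null).
justification = read_justification(details)
```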

Fix

Add a computed justification: str field to each of the five coded justification models, derived from the model's existing structured detail via a shared format_explained_tool_calls helper. model_dump() now emits a string justification for every coded evaluator, matching LLMJudgeJustification, without changing any of the structured fields.

| Evaluator | `justification` now derived from |
| --- | --- |
| tool-call-count | `explained_tool_calls_count` |
| tool-call-args | `explained_tool_calls_args` |
| tool-call-output | `explained_tool_calls_outputs` |
| tool-call-order | `lcs` |
| json-similarity | `matched_leaves` / `total_leaves` |
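As a rough illustration of the approach for one evaluator, the sketch below uses pydantic v2's `computed_field`. The class name, the field type, and the body of `format_explained_tool_calls` are assumptions; the PR only names the helper and the structured field, not their shapes or the output format.

```python
from pydantic import BaseModel, computed_field


def format_explained_tool_calls(explained: dict[str, str]) -> str:
    # Illustrative formatting only; the real helper's output may differ.
    return "; ".join(f"{tool}: {reason}" for tool, reason in explained.items())


class ToolCallCountJustification(BaseModel):
    # Hypothetical field type; the structured detail itself is left unchanged.
    explained_tool_calls_count: dict[str, str]

    @computed_field
    @property
    def justification(self) -> str:
        # Derived on access, so it always reflects the structured detail.
        return format_explained_tool_calls(self.explained_tool_calls_count)


dumped = ToolCallCountJustification(
    explained_tool_calls_count={"search": "expected 2, got 1"}
).model_dump()
```

Because the field is computed rather than stored, `model_dump()` includes `justification` alongside the existing structured key, and callers do not need to populate it.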

The base BaseEvaluatorJustification is intentionally left untouched: adding a computed justification there collides with LLMJudgeJustification's real justification field (pydantic raises TypeError). So the computed field lives on each coded subclass instead.

Verification

pytest tests/evaluators tests/cli/eval — all pass
mypy — clean
ruff check + ruff format — clean

Coded evaluators (tool-call count/args/output/order, json-similarity) stored their explanation under per-evaluator keys (explained_tool_calls_*, lcs, matched_leaves) with no 'justification' key, so the eval worker's d.get('justification') always resolved to null while the structured detail was still present. Add a computed 'justification' string field to each coded justification model, derived from its existing structured detail via a shared format_explained_tool_calls helper. model_dump() now emits a string justification for every evaluator, matching LLMJudgeJustification, without changing the structured fields or the worker.

sonarqubecloud · 2026-06-10T21:24:34Z

Quality Gate failed

Failed conditions
67.7% Coverage on New Code (required ≥ 90%)

See analysis details on SonarQube Cloud

github-actions bot added the test:uipath-langchain (triggers tests in the uipath-langchain-python repository) and test:uipath-integrations labels on Jun 10, 2026

ameyjain wants to merge 1 commit into main from fix/evaluator-justification-string
