Interactive eval TUI should list available models by provider #520

@christso

Objective

Add a provider-aware model selection step to AgentV's current interactive eval flow so users can choose a valid model for the selected target without already knowing provider-specific model strings.

Current Ground Truth

AgentV's current interactive flow is an Inquirer-based wizard that lets users:

  • discover/select eval files
  • choose a target
  • set advanced options like workers, dry-run, and cache
  • review and confirm execution

It does not currently prompt for model override or provider-specific model selection.

Problem

When a selected target/provider supports multiple models, users still need to know the correct model string ahead of time and pass it outside the interactive flow.

This causes a few issues:

  • trial and error with invalid model names
  • provider-specific confusion about whether a string is a model name or a reasoning preset
  • a gap between target selection and model configuration in the interactive flow

Proposed behavior

When the interactive flow reaches target configuration, AgentV should:

  • detect the selected provider/target type
  • query or enumerate the available models for that provider when possible
  • show a selectable list in the interactive flow
  • still allow manual entry as a fallback
  • preserve current behavior for providers where listing is unavailable
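A rough sketch of how the steps above could fit together, assuming a per-provider registry of listing functions. Every name here (`ModelLister`, `buildModelChoices`, `resolveModelChoices`, `MANUAL_ENTRY`) is hypothetical and not taken from AgentV's codebase. The `name`/`value` shape is chosen to match what Inquirer list prompts accept as choices.

```typescript
// Hypothetical sketch: none of these names come from AgentV's codebase.
interface ModelChoice {
  name: string; // label shown in the list
  value: string; // model string to use, or MANUAL_ENTRY
}

// Sentinel value: the caller would follow up with a free-text prompt.
const MANUAL_ENTRY = "__manual__";

// A provider-specific function that returns the available model strings.
type ModelLister = () => Promise<string[]>;

// Pure helper: turn a (possibly unavailable) model list into prompt choices.
// null means "listing unavailable", so the caller keeps today's behavior.
function buildModelChoices(models: string[] | null): ModelChoice[] | null {
  if (models === null) return null;
  const choices: ModelChoice[] = models.map((m) => ({ name: m, value: m }));
  choices.push({ name: "Enter a model name manually", value: MANUAL_ENTRY });
  return choices;
}

// Look up the lister for the selected provider and build the choices.
async function resolveModelChoices(
  provider: string,
  listers: Record<string, ModelLister | undefined>,
): Promise<ModelChoice[] | null> {
  const lister = listers[provider];
  if (!lister) return buildModelChoices(null);
  try {
    return buildModelChoices(await lister());
  } catch {
    // Lookup failed (network, auth, ...): fall back to manual entry only.
    return buildModelChoices([]);
  }
}
```

Returning `null` for providers without a lister lets the wizard skip the new step entirely, while a failed lookup still leaves the user with the manual-entry option.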

Nice-to-have

  • indicate deprecated models
  • indicate recommended/default models
  • show reasoning-related settings separately from model selection when the provider supports them
  • cache provider model lists for a short period to keep the flow responsive
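As an illustration of the caching and labelling ideas, here is a minimal sketch of an in-memory cache with a time-to-live plus a label formatter. `TtlCache`, `ModelInfo`, and `formatLabel` are made-up names, and the `deprecated`/`recommended` flags assume the provider exposes that information, which may not be true for every provider.

```typescript
// Hypothetical sketch: names and fields are assumptions, not AgentV APIs.
interface ModelInfo {
  id: string;
  deprecated?: boolean;
  recommended?: boolean;
}

// Small in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Append status tags to the label shown in the selection list.
function formatLabel(model: ModelInfo): string {
  const tags: string[] = [];
  if (model.recommended) tags.push("recommended");
  if (model.deprecated) tags.push("deprecated");
  return tags.length > 0 ? `${model.id} (${tags.join(", ")})` : model.id;
}
```

Keying the cache by provider name and using a short TTL (a few minutes, for example) would keep repeated runs of the wizard responsive without hiding newly released models for long.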

Why this matters

This would make the interactive flow more discoverable and reduce provider-specific guesswork, especially for Codex, Copilot, and similar agent providers whose supported models change over time.
