Skip to content

feat: model/effort overrides + models list discovery command#49

Merged
manojp99 merged 41 commits into
mainfrom
feat/model-overrides-and-discovery
Jun 12, 2026
Merged

feat: model/effort overrides + models list discovery command#49
manojp99 merged 41 commits into
mainfrom
feat/model-overrides-and-discovery

Conversation

@manojp99

Copy link
Copy Markdown
Collaborator

Summary

Adds two capabilities to amplifier-agent so model selection becomes a first-class, externally-controllable concern:

(A) Override pathrun accepts --model and --effort flags that pass through the TypeScript wrapper into the Python engine, overriding the catalog defaults in PROVIDER_CATALOG.

(B) Discovery path — New amplifier-agent models list --provider X subcommand that calls the Provider protocol's list_models() method and returns the result as envelope-wrapped JSON or a 4-column table. Plus a TypeScript wrapper helper listModels() for consumers.

Background

Today amplifier-agent never lets a caller choose a model. Model strings are hardcoded in PROVIDER_CATALOG (provider_sources.py), and downstream consumers (e.g. nanoclaw) have no way to:

  • ask for a different model on a per-run basis, or
  • enumerate what models a provider offers.

This unblocks downstream wiring without committing to any specific consumer-side design. The discovery half ports an existing reference implementation from amplifier_app_cli/provider_loader.py rather than inventing a new mechanism.

Changes

Change Set 1 — Override path

  • wrappers/typescript/src/argv-builder.tsAssembleArgvInput gains modelOverride? and effortOverride?; assembleArgv() emits --model / --effort flags.
  • src/amplifier_agent_cli/provider_sources.pybuild_provider_entry() and inject_provider() accept and apply both overrides.
  • src/amplifier_agent_cli/modes/single_turn.pyrun Click command gains --model and --effort options, threading to inject_provider().

Change Set 2 — Discovery path

  • New: src/amplifier_agent_cli/admin/models.pymodels list --provider <name> [--output json|table] [--timeout N] Click subcommand. Ports provider-loading helpers from amplifier_app_cli/provider_loader.py.
  • src/amplifier_agent_cli/__main__.py — registers models group alongside existing admin commands.

TS wrapper helper

  • New: wrappers/typescript/src/list-models.tslistModels({ provider, timeoutMs?, binaryPath?, env? }) returns parsed ModelsListEnvelope (or throws typed ListModelsError carrying exitCode and stderr). Spawns the Python CLI as a subprocess with full lifecycle, timeout, and error handling.
  • wrappers/typescript/src/index.ts — exports listModels, ListModelsError, and types.

Output schema (envelope-wrapped JSON)

{
  "schema_version": 1,
  "provider": "anthropic",
  "fetched_at": "2026-06-10T17:36:53Z",
  "models": [
    {
      "id": "claude-sonnet-4-5",
      "display_name": "Claude Sonnet 4.5",
      "context_window": 200000,
      "max_output_tokens": 8192,
      "capabilities": ["tools", "vision", "thinking"],
      "defaults": {"temperature": 0.7, "max_tokens": 8192}
    }
  ]
}

Table view filters to 4 columns: ID, DISPLAY NAME, CONTEXT, CAPABILITIES.

Exit codes (models list)

Exit Condition
0 Success — including a legitimately empty list (azure-openai, ollama-down)
1 Usage error — unknown provider, bad flag
2 Provider error — list_models() raised (auth, network, timeout)

No fallback. list_models() is a list API; if it raises, the error propagates and the command exits 2. Consumers decide what to do.

Tests

  • Python: 159 passed, 1 skipped (pre-existing Linux-only /proc test). New: 5 cases in tests/cli/test_provider_sources.py, ~7 cases in tests/cli/test_single_turn.py (including engine-boot integration), 13 cases in tests/cli/test_admin_models.py (new).
  • TypeScript: 114/114 passing across 21 files. New: 4 cases in argv-builder.test.ts for the new flags, 9 cases in list-models.test.ts (new — mocks node:child_process).
  • Linting: ruff check src/ tests/ clean, ruff format --check clean (140 files).
  • Build: tsc --noEmit clean, npm run build clean.

Known items (follow-ups, not blocking)

  1. Pre-existing pyright error in src/amplifier_agent_cli/modes/single_turn.py:674 (JsonDisplaySystem | CliDisplaySystem vs CliDisplaySystem). Confirmed pre-existing on main at the same code path (line 664). NOT introduced by this PR. Recommend separate follow-up.
  2. NIT: listModels() uses child.on("exit") instead of "close". For payloads larger than the 64KB pipe buffer, "close" is the strictly-correct event for stream-flush guarantee. Current model lists are <100KB so this is benign in practice.
  3. NIT: Test fix: Add Microsoft OSS compliance files and README sections #2 in list-models.test.ts (custom binary path) verifies argv but doesn't assert on the resolved envelope. Easy upgrade.
  4. NIT: Could add an explicit test for the error event path (ENOENT — missing binary) that the implementer added defensively beyond the spec's matrix.

Out of scope (intentional — design decisions, not omissions)

  • No nanoclaw changes — downstream work, blocked on this PR landing.
  • No provider module changes — all four already implement list_models() correctly.
  • No real-API integration tests — provider behaviors exercised via mocked FakeProvider.
  • No models default subcommand — explicitly dropped during design (consumer can hardcode catalog defaults if needed).
  • No --source live|catalog flag on models listlist_models() is a list API; failures propagate, consumer decides.
  • No boot-time validation that --model belongs to chosen provider — pass-through; provider rejects invalid id at its first API call with its own error.

Design + plan

  • Full design with locked decisions and alternatives considered: docs/designs/model-overrides-and-discovery.md
  • TDD implementation plan: docs/plans/2026-06-10-model-overrides-and-discovery-implementation.md

Process notes

This PR was developed via the Superpowers SDD recipe:

  • 17 tasks implemented via fresh-agent-per-task TDD (RED → GREEN → COMMIT)
  • Each task reviewed by spec-compliance reviewer and code-quality reviewer (with iteration where needed)
  • Holistic final review identified style fixes (ruff formatting, f-strings, explicit None check) — applied in cf5c1a1
  • The TS wrapper listModels() addition was a follow-up, separately TDD-implemented and code-quality reviewed (verdict: APPROVED WITH NITS).

🤖 Generated with Amplifier

Co-Authored-By: Amplifier 240397093+microsoft-amplifier@users.noreply.github.com

Manoj Prabhakar Paidiparthy and others added 30 commits June 10, 2026 19:24
Captures the validated design for two amplifier-agent CLI capabilities:
- run --model / --effort override path (wrapper -> engine -> provider config)
- models list subcommand calling Provider.list_models() (no fallback)

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Add `--model` and `--effort` CLI options to the `run` command that thread
through `_TurnSpec` fields `model_override` and `effort_override` and are
forwarded to `inject_provider` so they propagate to the provider module.

- Append `model_override` and `effort_override` fields to `_TurnSpec`
- Pass both fields to `inject_provider` in `_execute_turn`
- Add `@click.option("--model")` and `@click.option("--effort")` decorators
- Extend `run()` signature with the two new parameters
- Pass both kwargs to the `_TurnSpec()` constructor

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…ride

Adds test_run_override_lands_in_mount_plan to test_single_turn.py.

Exercises real _execute_turn -> real inject_provider with engine boot faked
(load_and_prepare_cached, Engine, make_turn_handler). Verifies that Edit 3b
(inject_provider forwarding model_override/effort_override) correctly wires
CLI flags into prepared.mount_plan['providers'][0]['config'].

TDD verification: temporarily reverted Edit 3b and confirmed the test fails
with default_model='claude-opus-4-5' instead of 'claude-sonnet-4-5'. Restored
Edit 3b and confirmed PASS.

Change Set 1 safety net: this test guards against accidental regression of
inject_provider forwarding.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Port load_provider_class from provider_loader.py into admin/models.py.
Resolves provider class by convention ({Name}Provider), with fallback
to __all__ and dir(module) iteration. Returns None on ImportError (no
raise). Ported with line length 120 formatting.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
- Wire models list --output json to emit JSON envelope with schema_version,
  provider, fetched_at, and models[] containing model_dump fields
- Add unknown-provider guard via ClickException with known providers list
- Resolve 'auto' output mode to 'table' on TTY, 'json' when piped
- Add _render_json helper building the envelope payload
- Add temporary _render_table stub (real renderer lands in Task 13)
- Add SCHEMA_VERSION = 1 module constant
- Add test_models_list_json_envelope_shape covering the full envelope shape

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Replace the _render_table stub with a real renderer that outputs
four aligned columns (ID, DISPLAY NAME, CONTEXT, CAPABILITIES).
Column widths are computed as the max of header length and the
longest cell value; cells are joined with two spaces and trailing
whitespace is stripped.  Capabilities are comma-joined.

Adds test_models_list_table_columns which asserts headers and data
values are present in --output table output.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
When list_provider_models() raises an exception, catch it in models_list,
echo an error message to stderr (# <provider>: list_models() failed: <ExcType>: <msg>),
and call sys.exit(2). This ensures no structured JSON leaks to stdout and the
exit code is 2 (distinguishable from Click's default exit 1 for exceptions).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Adds test_models_list_unknown_provider_exits_1 which:
- invokes cli ['models', 'list', '--provider', 'not-a-provider']
- asserts exit_code == 1 (click.ClickException exits 1, not 2)
- asserts 'not-a-provider' in stderr

Regression detection confirmed: commenting out the PROVIDER_CATALOG guard
causes the test to fail (command proceeds, exits 0 with advisory), proving
the test validates the correct exit-code contract.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…licit None check

- ruff format test_provider_sources.py: add missing blank line before
  test_inject_provider_forwards_model_override (E302)
- ruff check --fix test_admin_models.py: convert 3 .format() calls to
  f-strings at lines 220-224 (UP032)
- provider_sources.py:175: replace falsy 'model_override or ...' with
  explicit 'model_override if model_override is not None else ...'
  to match stated str|None contract and be consistent with the
  effort_override guard on line 178

Tests: 159 passed, 1 skipped (no regressions)

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…uilder dist

- tests/cli/test_single_turn.py: sort local imports (ruff I001) + format
- wrappers/typescript/dist/argv-builder.{js,d.ts}: rebuild to include
  modelOverride/effortOverride emission added in a9b2f9b/2e2a631

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
9 vitest cases exercise the wrapper-side counterpart to the
'amplifier-agent models list --output json' Python subcommand:
happy path, custom binary, empty list (azure-openai), provider/usage
error exits, timeout, malformed JSON, schema_version mismatch, env
forwarding. node:child_process is mocked via vi.mock so spawned
processes are fully simulated.

This commit is the RED step: tests will not compile until
src/list-models.ts ships in the follow-up.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Adds a wrapper-side counterpart to the Python 'amplifier-agent models list
--output json' subcommand. listModels() spawns the engine binary, decodes
the JSON envelope, and exposes failure modes through a typed ListModelsError
carrying exitCode and stderr so callers can distinguish auth vs network
vs usage errors.

Public surface (re-exported from index.ts):
  - listModels()         — async (params) => ModelsListEnvelope
  - ListModelsError      — Error subclass with exitCode + stderr
  - ListModelsParams     — provider, timeoutMs, binaryPath, env
  - ModelInfo            — single model entry (mirrors core ModelInfo)
  - ModelsListEnvelope   — { schema_version: 1, provider, fetched_at, models }

Key contracts:
  - exit 0 + parseable envelope → resolve with envelope (empty models OK)
  - exit 1 (usage error)        → reject ListModelsError, exitCode 1
  - exit 2 (provider error)     → reject ListModelsError, exitCode 2
  - timeout (default 15s)       → SIGTERM subprocess, reject
  - malformed JSON / bad schema → reject with 'invalid envelope: ...'
  - env: undefined inherits process.env; provided env passed through

9 vitest cases pass (RED → GREEN); full wrapper suite stays at 114/114.
dist/ artifacts regenerated by 'npm run build'.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…dule-not-installed

DTU integration testing of PR #49 (feat/model-overrides-and-discovery) found
two related defects in 'amplifier-agent models list':

1. Provider constructors received api_key="" instead of the real env-var
   value (ANTHROPIC_API_KEY / OPENAI_API_KEY / AZURE_OPENAI_API_KEY). The
   Anthropic SDK does NOT fall back to reading the env var when an empty
   string is explicitly passed, so live calls failed with 'Could not resolve
   authentication method' TypeError.

2. When a provider module was not yet pip-installed (fresh container,
   before the first 'run' triggers engine-mount install), ImportError was
   silently swallowed and the command returned exit 0 with an empty list
   plus the advisory 'no live model list available; enter a model name
   manually' -- visually identical to the legitimate azure-openai empty
   case.

This commit lands the failing tests that pin both regressions:
  - 9 _resolve_provider_credentials() tests (anthropic, openai, azure-openai
    legacy fallback, ollama default+env, missing-env exit-2 paths)
  - 3 _try_instantiate_provider() tests asserting credentials reach the
    constructor + backward-compat without credentials
  - 2 list_provider_models() tests for env-var passthrough and module-not-
    installed signalling
  - 2 CLI-level tests asserting exit 2 + actionable stderr (NOT misleading
    advisory) for both failure modes

All 16 new tests currently FAIL against HEAD. Implementation follows in the
next commit.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Root cause (found via DTU integration testing of PR #49):
'_try_instantiate_provider' hardcoded api_key="", base_url="http://placeholder",
and host="http://localhost:11434" as constructor args. The docstring claimed
this enabled list_models() to be invoked 'without a live configuration', but
the assumption is wrong for any provider whose list_models() makes a live
authenticated API call. The Anthropic SDK explicitly rejects api_key="" with
'Could not resolve authentication method' and does NOT fall back to reading
ANTHROPIC_API_KEY from the environment when an empty string is explicitly
passed.

Secondary symptom: when the provider module wasn't yet pip-installed in the
container (fresh container, before any 'run' triggered engine-mount install),
ImportError was caught silently inside 'load_provider_class' and the CLI
returned exit 0 with 'no live model list available; enter a model name
manually' — visually identical to the legitimate azure-openai empty case.

Fix:
- Add ProviderCredentialsMissingError and ProviderModuleNotInstalledError
  typed exceptions for the two failure modes.
- Add _resolve_provider_credentials() that reads from os.environ per the
  PROVIDER_CATALOG env_var mapping:
    anthropic    -> ANTHROPIC_API_KEY
    openai       -> OPENAI_API_KEY
    azure-openai -> AZURE_OPENAI_API_KEY, falling back to AZURE_OPENAI_KEY
    ollama       -> OLLAMA_HOST or OLLAMA_BASE_URL, defaulting to
                    http://localhost:11434
  api-key-based providers raise ProviderCredentialsMissingError when the env
  var is unset/empty; ollama is permissive because its list_models() handles
  an unreachable daemon gracefully by returning [].
- Extend _try_instantiate_provider() with a credentials= kwarg that overrides
  the placeholder defaults. Existing call sites without credentials retain
  prior behaviour.
- Update list_provider_models() to resolve credentials first, then call
  _load_provider_module() directly so ImportError can be reraised as
  ProviderModuleNotInstalledError (load_provider_class keeps its silent-None
  contract for other callers).
- Update the CLI 'models list' handler to map ProviderCredentialsMissingError
  and ProviderModuleNotInstalledError to exit code 2 with actionable stderr
  messages naming the env var (or the install path).
- Replace the misleading docstring on _try_instantiate_provider; document
  the user-visible exit-code/stderr contract in the module docstring.

Reference: amplifier_app_cli.provider_loader.get_provider_models() solves
the same problem via _resolve_env_placeholder + collected_config; this port
reads env vars directly so the CLI's documented 'set env var, run agent' UX
works without a settings file.

DTU validation, per the integration testing that surfaced this bug:
- T3/T4 anthropic models list with ANTHROPIC_API_KEY set: now returns
  populated model list (was: empty + misleading advisory).
- T8 anthropic models list with no key: exit 2 + 'ANTHROPIC_API_KEY not set'
  (was: exit 0 + empty + misleading advisory).
- Fresh container before 'run': exit 2 + 'provider module not installed,
  run amplifier-agent run --provider anthropic <prompt> once' (was: exit 0
  + same misleading advisory as the legitimate empty case).
- T5 azure-openai with AZURE_OPENAI_API_KEY set: still exit 0 + empty +
  legitimate 'enter deployment manually' advisory (correctly unchanged).
- T9 override path (--model claude-sonnet-4-5): unchanged, still works.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
… PROVIDER_CREDENTIAL_VARS

Following the Q1 follow-up architectural review (amplifier-app-cli pattern):

- PROVIDER_CATALOG entries should be bootstrap-only: `module` and `source`
  fields, nothing else. The catalog tells the kernel WHERE to install a
  provider from. Everything else flows from `provider.get_info()` at runtime.

- The env-var mapping moves out of the catalog into a small auxiliary
  `PROVIDER_CREDENTIAL_VARS` mapping that mirrors amplifier-app-cli's
  scoped credential fallback table. Each value is (primary, *legacy).

- `build_provider_entry` no longer emits `default_model` in the returned
  config when no override is supplied — the provider's own
  `get_info().defaults["model"]` is the source of truth, eliminating the
  catalog/provider drift already observed for ollama
  (catalog said "llama3.2", provider says "llama3.2:3b").

These tests are RED: they assert the post-shrink contract and fail against
the current 5-field catalog.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…mplifier-app-cli pattern

Mirrors `amplifier_app_cli.provider_loader`'s separation of install
catalog from credential mapping. Eliminates the catalog/provider
drift class that produced the ollama default_model mismatch
(catalog: "llama3.2"; provider get_info(): "llama3.2:3b").

Changes
=======

`provider_sources.py`:

- `_CatalogEntry` TypedDict shrunk from 5 fields to 2: removed
  `env_var`, `legacy_env_vars`, `default_model`.
- `PROVIDER_CATALOG` carries only `{module, source}` per provider —
  the bootstrap fields the kernel needs before the provider module
  exists locally.
- Added `PROVIDER_CREDENTIAL_VARS: dict[str, tuple[str, ...]]` as a
  small auxiliary mapping `{provider: (primary_env, *legacy_envs)}`.
  Scoped, single source of truth for env var name lookups.
- `build_provider_entry` no longer emits `default_model` in the
  returned config when no override is supplied — the provider's own
  `get_info().defaults["model"]` is the source of truth. Matches
  amplifier_app_cli.configure_provider's "no hard-coded provider
  defaults" rule.
- Introduced helper `_resolve_env_credential` that walks the credential
  vars (primary then legacy) and emits the deprecation notice on
  legacy use — same behavior, deduplicated lookup logic.

`admin/models.py`:

- `_resolve_provider_credentials` reads env var names from
  `PROVIDER_CREDENTIAL_VARS` instead of hardcoding them per-provider.
  Ollama still has its own branch because its result shape is
  `{"host": ...}` (not `{"api_key": ...}`) and an unreachable daemon
  is exit-0 + advisory rather than an error.

Regression check
================

The run command path (run --provider X --model Y) keeps working: when
--model is passed, model_override flows through inject_provider into
config["default_model"] exactly as before. When --model is omitted,
default_model is now absent from the config — the kernel and the
provider module handle the absence by sourcing the default from
provider.get_info().defaults["model"].

Verified by the existing test_run_override_lands_in_mount_plan test
plus the full 176-test cli suite. No regressions.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
… opt-in

Adds RED tests for the Q2 filter flip: the CLI's discovery-time
default to list_models() should be filtered=False (full list), with
--latest as the opt-in for the filtered subset (one model per family).

Anthropic's provider module currently defaults filtered=True, which
collapses any list_models() response to 3 entries (latest per
opus/sonnet/haiku family). DTU integration testing surfaced this as
confusing for users running `models list --provider anthropic`.

Provider-module default stays filtered=True — other callers
(spawn_utils.py, routing-matrix resolver.py) depend on it. The flip
applies only at the admin CLI layer.

New tests assert:
- `models list --provider anthropic` constructs the provider with
  config={"filtered": False}
- `models list --provider anthropic --latest` constructs with
  config={"filtered": True}
- --help advertises --latest
- list_provider_models() accepts extra_config and forwards it
- _try_instantiate_provider() accepts extra_config and threads it
  into the constructor's config arg

5 RED failures as expected; existing 31 tests still pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Q2 fix: when a user runs `amplifier-agent models list --provider
anthropic`, they expect to see every model the provider reports.
Previously the response was collapsed to 3 entries (latest per
opus/sonnet/haiku family) because the anthropic provider module
defaults to `filtered=True` and the CLI passed no override.

Implementation
==============

- `_try_instantiate_provider` accepts an optional `extra_config` dict
  that gets merged into the constructor's `config` argument. Threads
  through every constructor signature variant (anthropic/openai,
  azure, vllm, ollama, config-only). The no-arg fallback only fires
  when extra_config is empty — otherwise a filter directive would be
  silently dropped.

- `list_provider_models` accepts and forwards `extra_config`.

- The `models_list` Click command grows a `--latest` flag (default
  off). The command builds `{"filtered": bool(latest_only)}` and
  passes it through so:
    * `models list --provider anthropic`           → filtered=False (full list)
    * `models list --provider anthropic --latest`  → filtered=True (3 models)

Scope
=====

The provider module's own `filtered=True` default is unchanged. Other
callers (`amplifier_foundation.spawn_utils`,
`amplifier_bundle_routing_matrix.resolver`) rely on the existing
default. The flip is admin-CLI-scoped on purpose.

Pre-existing tests that mocked `list_provider_models` with a closed
signature now accept `**_` so the new kwarg doesn't break them.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
RED tests for Q3: when --provider is omitted, models list should
enumerate every known provider in parallel and return a per-provider
status envelope. The DTU-tested user need is "show me everything I
can talk to right now" without having to invoke the command four
times.

New tests assert:

- `--provider` is no longer required (no UsageError on omission).
- JSON envelope shape: top-level schema_version + fetched_at + results
  array (no top-level provider field). Each results entry has
  {provider, status, models[, error]}.
- Status taxonomy:
    ok                   → list_models returned (empty list is OK)
    credentials_missing  → ProviderCredentialsMissingError
    module_not_installed → ProviderModuleNotInstalledError
    error                → other exception, message captured
- Exit code: 0 when at least one provider returned ok; 2 when ALL
  providers failed (envelope still emitted so the caller can inspect).
- Table output adds PROVIDER + STATUS + MODELS columns.
- The filter directive (--latest / default) propagates to every
  per-provider call in aggregate mode.

8 RED failures as expected; existing 36 tests still pass.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…r omitted

Q3 fix: drop the requirement to pre-decide which provider to query.
When the user runs `amplifier-agent models list` with no --provider,
the command now enumerates every known provider in parallel and emits
a per-provider results envelope so the user can see at a glance which
providers are configured, which are missing credentials, which aren't
installed, and what models each one reports.

Implementation
==============

- `--provider` becomes optional (required=False, default=None). The
  single-provider behavior is unchanged when --provider is supplied.

- New helper `_aggregate_models()` runs `list_provider_models` on every
  catalog entry in parallel. Each per-provider call is wrapped in
  `asyncio.to_thread` (because list_provider_models is sync but
  internally drives an asyncio loop) and gathered via `asyncio.gather`.
  Per-provider exceptions are caught and recorded — one provider's
  failure never knocks out another.

- Status taxonomy (mirrors the single-provider error contract):
    ok                   → list_models returned (empty list is OK)
    credentials_missing  → ProviderCredentialsMissingError
    module_not_installed → ProviderModuleNotInstalledError
    error                → other exception, message captured

- JSON envelope:

      {
        "schema_version": 1,
        "fetched_at": "<ISO8601>",
        "results": [
          {"provider": "anthropic",    "status": "ok",                  "models": [...]},
          {"provider": "openai",       "status": "credentials_missing", "models": [], "error": "..."},
          ...
        ]
      }

- Table view adds PROVIDER + STATUS + MODELS columns. Model list is
  abbreviated to first 3 + "(N total)" once it exceeds 3 entries; the
  full list lives in the JSON envelope.

- Exit codes:
    0  → at least one provider returned status == "ok"
    2  → ALL providers failed (envelope still emitted so the caller can
         see the per-provider error breakdown)

- The filter directive from --latest / default propagates to every
  per-provider call in aggregate mode.

Verified by the 8 RED tests plus the full 190-test cli suite. No
regressions.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
microsoft-amplifier and others added 11 commits June 11, 2026 18:08
…n bucket

Mechanical cleanup after Cycles 1-3:
- ruff format applied to admin/models.py and tests/cli/test_admin_models.py
- Removed unused `# noqa: BLE001` directive that ruff couldn't find a
  matching rule for (the project's ruff config doesn't enable BLE001).

No behavior change. Full cli suite still 190 pass / 1 skip.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
… flags rejected

RED phase — tests assert the new contract that all provider configuration
flows through host_config.provider.{module,config}, with no CLI override
path. These tests fail against the current implementation.

tests/cli/test_single_turn.py:
  - Remove the prior --model/--effort/--provider integration tests
    (these argv flags are being removed; the contract they pinned no longer holds)
  - Add @parametrize guards that --provider/--model/--effort:
      (a) are absent from `run --help` output
      (b) are rejected by click with `No such option`
  - Add tests that host_config wires:
      provider.module           -> mount_plan[providers][0][module]
      provider.config.default_model -> config[default_model]
      provider.config.effort        -> config[effort]
      provider.config.<arbitrary>   -> pass-through (temperature, max_tokens, ...)
  - Add baseline: no host_config -> config has only api_key + priority

tests/cli/test_provider_sources.py:
  - Append tests for the new extra_config kwarg on
    build_provider_entry / inject_provider — the pass-through dict that
    carries host_config.provider.config into the mount entry. Includes a
    safety pin that extra_config cannot clobber the env-resolved api_key.

Expected failures (13): 9 in test_single_turn.py, 4 in test_provider_sources.py.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…gle source

host_config.json is now the sole source of truth for provider configuration.
The argv-flag tier is removed entirely -- no deprecation, no migration shim.

Engine (modes/single_turn.py):
  - Drop @click.option for --provider, --model, --effort
  - Drop matching parameters from `run` function signature
  - Provider resolution cascade is now just:
      host_config["provider"]["module"] > bundle default_provider
  - New cascade for per-provider config knobs:
      host_config["provider"]["config"]    -> all knobs (default_model,
                                                   effort, temperature,
                                                   max_tokens, ...)
      none                                     -> provider's get_info()
                                                   defaults only
  - _TurnSpec: swap model_override / effort_override for a single
    provider_config dict (pass-through carrier)
  - _execute_turn forwards spec.provider_config to inject_provider as
    extra_config; spec.provider still drives selection

Provider entry builder (provider_sources.py):
  - build_provider_entry / inject_provider gain an extra_config kwarg
  - extra_config is overlaid onto the final config dict after the explicit
    model_override / effort_override are applied
  - api_key is re-asserted from the env-resolved value AFTER the overlay so
    a stale host_config cannot silently downgrade a fresh credential
  - The existing model_override / effort_override kwargs are kept for direct
    callers (they're still useful as a typed shorthand even though the CLI
    no longer feeds them)

Sidesteps the merger.py module-key bug entirely: single_turn.py reads
host_config["provider"]["config"] directly and forwards it via
inject_provider, bypassing the merger.py path where 'anthropic-provider'
vs 'provider-anthropic' mismatch would silently drop the config. The bug
itself is documented as a follow-up in the design doc; this batch does
not fix it.

Tests:
  - tests/cli/test_mode_a_integration.py: the --provider-override regression
    test is replaced by a host_config.provider.module equivalent against the
    real Engine + stubbed PreparedBundle
  - tests/cli/test_mode_a_v2_real_binary.py: drop --provider anthropic from
    the real-binary happy-path invocation (anthropic is the bundle default,
    so the flag was redundant even before this change)

All 204 tests/cli/ pass (1 skipped: PGID test, Linux-only).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
… surface)

RED phase — replace the 6 prior tests (3 "emitted when set" + 3 "absent when
unset" pairs for --provider/--model/--effort) with three @ts-expect-error
guards that mirror the existing pattern used for --env-allowlist, --env-extra,
and --allow-protocol-skew.

The new guards pin both:
  1. The argv flag is no longer emitted by assembleArgv()
  2. The corresponding field (providerOverride / modelOverride /
     effortOverride) is no longer part of the AssembleArgvInput type —
     passing it must be a TypeScript type error.

Expected failures (3): the new tests fail today because assembleArgv() still
emits the flags. They turn green when argv-builder.ts drops the three
optional fields and their emission blocks (next commit).

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…pper surface

GREEN phase — the three optional fields are removed from the TS wrapper's
public input surfaces and from the argv emission, in sync with the engine
side that no longer accepts these CLI flags. host_config.provider.{module,config}
(passed in via the surviving `configPath` field) is the single source of truth.

argv-builder.ts
  - AssembleArgvInput drops providerOverride / modelOverride / effortOverride
  - assembleArgv() drops the three corresponding argv.push blocks
  - Removed-flags doc block extended to record this removal alongside the
    earlier mcp-config-path / env-* / allow-protocol-skew removals

session.ts
  - Internal session params drop providerOverride
  - assembleArgv() call site drops the forwarding line

index.ts
  - SpawnAgentParams drops providerOverride (public API)
  - The spread block that forwarded providerOverride into spawn() is gone

list-models.ts
  - Module docstring updated: the override half no longer lives in
    `modelOverride` / `effortOverride`; it lives in host_config.provider.config
    and reaches the engine via `configPath`. The discovery half is unchanged.

types.ts
  - Untouched in source. The generated wire-protocol type
    `InitializeParams.providerOverride` is a JSON-RPC method parameter, not
    the same surface as the now-removed SpawnAgentParams field. It carries
    the resolved provider name from the CLI into the engine's init call and
    is regenerated by the prebuild step from the upstream JSON schema.

dist/
  - Regenerated via `npm run build` (prebuild + tsc). All four .d.ts/.js
    pairs reflect the source changes; no manual edits.

Tests:
  - vitest: 113 passed across 21 files (argv-builder.test.ts now exercises
    the three @ts-expect-error guards added in the previous RED commit and
    confirms the engine no longer emits --provider/--model/--effort).
  - tsc --noEmit: clean.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
….py follow-up

v2 of the model-overrides-and-discovery design supersedes Change Set 1.
The original CLI-flag approach (--provider/--model/--effort) shipped on
this branch, then was reverted in favor of a config-file-only approach:
host_config.json is now the single source of truth for provider
configuration. Change Set 2 (the `models list` discovery subcommand) is
unaffected.

Additions:
  - Prominent v2 supersession callout immediately under the H1
  - New section: "v2 Override Path (supersedes Change Set 1)"
    + resolution cascade (host_config.provider.module > bundle default)
    + config-file schema with pass-through provider.config dict
    + engine wiring (single_turn -> inject_provider(extra_config=...))
    + TS wrapper surface deltas (AssembleArgvInput, SpawnAgentParams)
    + 'what did not change' (Change Set 2, catalog shrink) for clarity
  - Known follow-up sub-section: the merger.py module-key mismatch.
    `amplifier_agent_lib/config/merger.py` maps friendly provider names
    to bundle module keys with the legacy "<name>-provider" convention,
    but `inject_provider()` writes the kernel convention "provider-<name>".
    The strings do not match, so the merger's provider.config overlay is a
    silent no-op. v2 sidesteps the bug by reading host_config in
    single_turn.py and forwarding directly via extra_config; the merger
    is not on the v2 critical path. Recorded as a separate cleanup tied
    to the broader baked-in-bundle / host-config layering work (D6/D11).
  - Out-of-Scope list extended:
    + 'fixing the merger.py module-key mismatch' (tracked separately)
    + 'deprecation shims' for the removed flags (v2 is a fresh pass)

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Auto-applied formatting on the two test files added in earlier commits in
this stack. No behavioral changes -- ruff format collapsed multi-line
f-string assertion messages onto single lines and removed one blank line.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
RED test: assert that priority survives an extra_config attempt to
overwrite it. Pairs with the existing api_key-clobbering test --
both fields belong to the same 'engine-asserted, not user-tunable'
class and should be protected by the same mechanism.

NIT 3 from PR #49 code-quality review.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Factor the 'engine-asserted keys win over extra_config' convention
into a small helper. Closes two narrow concerns from the PR #49
review (NIT 1 + NIT 3):

* NIT 1 -- api_key was protected via inline re-assertion; one
  protected key per credential type would not scale as the surface
  grows (OAuth, managed identity).
* NIT 3 -- priority was unprotected and could be silently clobbered
  by a host_config that included priority in provider.config. Moot
  today (single-provider mounts) but a latent footgun.

The helper is the explicit extension point for future engine-owned
keys; callers no longer need to remember to re-assert anything.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
NIT 2 from PR #49 review. The previous code repeated the
'isinstance(host_config, dict) and isinstance(host_config.get("provider"), dict)'
guard at two adjacent reads (provider name and provider config).
Collapse to a single normalize-once block that materializes a
provider_block dict, then derives both fields from it linearly.

Behavior preserved -- exercised by existing test_single_turn tests.
No new tests; this is a readability refactor.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
NIT 4 from PR #49 review. After the --provider/--model/--effort flags
were removed from 'amplifier-agent run', several comments and one
user-facing error message still pointed at the gone surface:

* admin/models.py:352 -- ProviderModuleNotInstalledError told users
  to run 'amplifier-agent run --provider X <prompt>' to trigger an
  install. That command now fails with 'No such option'. Re-point
  at the host_config.json install path.
* provider_sources.py:77 -- KNOWN_PROVIDERS docstring's '--provider
  override' example was about the removed run flag. Re-anchor on
  'models list --provider' (which still uses the constant) and
  aggregate iteration.
* single_turn.py:53 -- _read_bundle_default_provider docstring
  mentioned the removed --provider override; the only remaining
  signal is host.provider.module.
* tests/cli/test_default_provider_fallback.py and
  test_single_turn_init_params.py -- module/test docstrings
  similarly referenced the removed flag.

The explicit 'the --provider argv flag was removed' explanatory
comment in single_turn.py:606 stays -- it documents the removal,
not a still-supported surface. wrappers/ TS sources had no matches.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@manojp99 manojp99 marked this pull request as ready for review June 12, 2026 07:28
@manojp99 manojp99 merged commit e48a644 into main Jun 12, 2026
2 of 3 checks passed
manojp99 added a commit that referenced this pull request Jun 12, 2026
#54)

Add a new listAllModels() function to expose the engine's aggregate provider
discovery mode, which queries all configured providers in parallel and returns
per-provider results with auth status.

Design choices:
- Separate function (not overload) due to structurally different envelope type
- Extracted runModelsListSubprocess() and interpretOutcome() helpers for code reuse
- Status field intentionally free-form string to permit future engine additions
- Existing listModels() behavior unchanged; backward compatible

Testing:
- 11 new test cases (A1-A11) mirror existing 9 for listModels
- All 124 tests pass
- Asserts correct argv construction (omits --provider in aggregate mode)

Motivates paperclip's amplifier-local adapter to use dynamic model discovery
instead of static list or 4 parallel single-provider calls.

Requires: engine PR #49+ for amplifier-agent models list aggregate mode support

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-authored-by: Manoj Prabhakar Paidiparthy <mpaidiparthy@microsoft.com>
Co-authored-by: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants