
Fetch model catalog from network #2013

Merged

luispater merged 5 commits into dev from model on Mar 10, 2026
Conversation


@hkfires (Collaborator) commented Mar 10, 2026

No description provided.

@gemini-code-assist commented

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the flexibility and maintainability of the application's model catalog. By transitioning from hardcoded Go structs to a dynamically fetched JSON file, the system can now update its supported AI models from a remote source at startup. This change streamlines the process of adding or modifying model definitions. Furthermore, it introduces more granular control over Codex models, enabling the application to offer specific model tiers to users based on their subscription plan, improving alignment with provider offerings.

Highlights

  • Model Catalog Externalization: The static model definitions have been moved from Go source code into a dedicated models.json file.
  • Dynamic Model Updates: A new mechanism has been introduced to fetch and update the model catalog from network URLs at application startup, allowing for dynamic updates without requiring a new build or deployment.
  • Tiered Codex Model Support: The system now dynamically selects the appropriate Codex model tier (Free, Team, Plus, Pro) based on the user's plan type, which is extracted from their authentication token.


Changelog
  • cmd/server/main.go
    • Imported the new registry package.
  • Invoked StartModelsUpdater during application startup in both standalone and proxy service modes.
  • internal/registry/model_definitions.go
    • Updated comments to reflect that model metadata is now loaded from an embedded JSON file and can be refreshed from the network.
    • Introduced new structs (AntigravityModelConfig, staticModelsJSON) to mirror the JSON structure.
    • Added numerous Get*Models() functions to retrieve model definitions from the central modelsCatalogStore.
    • Modified GetStaticModelDefinitionsByChannel to use GetCodexProModels for the "codex" channel.
    • Updated LookupStaticModelInfo to fetch models from the new modelsCatalogStore and clone them.
    • Added helper functions cloneAntigravityModelConfig and cloneModelInfos for deep cloning model configurations.
  • internal/registry/model_definitions_static_data.go
    • Removed the file, as its contents have been migrated to models/models.json and the new dynamic loading mechanism.
  • internal/registry/model_updater.go
    • Added a new file responsible for loading model definitions from an embedded models/models.json file and fetching updates from predefined network URLs.
    • Implemented logic for a one-time refresh of models at startup with a timeout.
    • Included functions for unmarshaling JSON data and validating the model catalog structure.
    • Utilized sync.Once to ensure the updater runs only once.
  • internal/registry/models/models.json
    • Added a new JSON file containing the comprehensive catalog of AI model definitions, previously hardcoded in model_definitions_static_data.go.
  • internal/runtime/executor/claude_executor_test.go
    • Adjusted attribute keys in a test case for Claude executor.
  • internal/watcher/synthesizer/file.go
    • Imported the codex package.
    • Added logic to parse the JWT id_token for "codex" provider to extract and store the plan_type attribute.
  • sdk/api/handlers/openai/openai_responses_websocket_test.go
    • Removed an extraneous blank line in a test file.
  • sdk/auth/codex_device.go
    • Included the plan_type attribute in the AuthRecord when building it for Codex authenticators.
  • sdk/cliproxy/service.go
    • Modified the registerModelsForAuth function to dynamically select specific Codex model lists (e.g., GetCodexProModels, GetCodexPlusModels) based on the extracted plan_type from the authentication attributes.
Ignored Files
  • Ignored by pattern: .github/workflows/** (3)
    • .github/workflows/docker-image.yml
    • .github/workflows/pr-test-build.yml
    • .github/workflows/release.yaml


@gemini-code-assist bot left a comment


Code Review

This pull request refactors the model catalog to be fetched from the network, with an embedded fallback, instead of being hardcoded. This is a significant improvement for maintainability. The changes also introduce support for tiered models for the 'codex' provider based on user plan type. The implementation is solid, but I've identified a bug in the model lookup logic and a potential resource leak in the new network fetching code. My review includes suggestions to address these issues.

Comment on lines 206 to 216:

```diff
 allModels := [][]*ModelInfo{
-	GetClaudeModels(),
-	GetGeminiModels(),
-	GetGeminiVertexModels(),
-	GetGeminiCLIModels(),
-	GetAIStudioModels(),
-	GetOpenAIModels(),
-	GetQwenModels(),
-	GetIFlowModels(),
-	GetKimiModels(),
+	data.Claude,
+	data.Gemini,
+	data.Vertex,
+	data.GeminiCLI,
+	data.AIStudio,
+	data.CodexPro,
+	data.Qwen,
+	data.IFlow,
+	data.Kimi,
 }
```

high

The allModels slice in LookupStaticModelInfo is missing the model lists for CodexFree, CodexTeam, and CodexPlus. This will cause lookups for models in these tiers to fail, returning nil even if they are defined in the model catalog. To ensure all models can be found, these tiers should be added to the slice.

Suggested change:

```go
		data.Claude,
		data.Gemini,
		data.Vertex,
		data.GeminiCLI,
		data.AIStudio,
		data.CodexFree,
		data.CodexTeam,
		data.CodexPlus,
		data.CodexPro,
		data.Qwen,
		data.IFlow,
		data.Kimi,
```

Comment on lines +61 to +104:

```go
func tryRefreshModels(ctx context.Context) {
	client := &http.Client{Timeout: modelsFetchTimeout}
	for _, url := range modelsURLs {
		reqCtx, cancel := context.WithTimeout(ctx, modelsFetchTimeout)
		req, err := http.NewRequestWithContext(reqCtx, "GET", url, nil)
		if err != nil {
			cancel()
			log.Debugf("models fetch request creation failed for %s: %v", url, err)
			continue
		}

		resp, err := client.Do(req)
		if err != nil {
			cancel()
			log.Debugf("models fetch failed from %s: %v", url, err)
			continue
		}

		if resp.StatusCode != 200 {
			resp.Body.Close()
			cancel()
			log.Debugf("models fetch returned %d from %s", resp.StatusCode, url)
			continue
		}

		data, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		cancel()

		if err != nil {
			log.Debugf("models fetch read error from %s: %v", url, err)
			continue
		}

		if err := loadModelsFromBytes(data, url); err != nil {
			log.Warnf("models parse failed from %s: %v", url, err)
			continue
		}

		log.Infof("models updated from %s", url)
		return
	}
	log.Warn("models refresh failed from all URLs, using current data")
}
```

medium

The tryRefreshModels function can be simplified and made more robust by using defer for resource cleanup. The current implementation has manual cancel() calls that are easy to miss in some error paths, potentially leading to context leaks. Using defer cancel() and defer resp.Body.Close() is the idiomatic Go pattern for ensuring resources are always released.

```go
func tryRefreshModels(ctx context.Context) {
	client := &http.Client{Timeout: modelsFetchTimeout}
	for _, url := range modelsURLs {
		reqCtx, cancel := context.WithTimeout(ctx, modelsFetchTimeout)

		err := func() error {
			defer cancel()
			req, err := http.NewRequestWithContext(reqCtx, "GET", url, nil)
			if err != nil {
				return fmt.Errorf("request creation failed: %w", err)
			}

			resp, err := client.Do(req)
			if err != nil {
				return fmt.Errorf("http do failed: %w", err)
			}
			defer resp.Body.Close()

			if resp.StatusCode != http.StatusOK {
				return fmt.Errorf("bad status: %d", resp.StatusCode)
			}

			data, err := io.ReadAll(resp.Body)
			if err != nil {
				return fmt.Errorf("read body failed: %w", err)
			}

			if err := loadModelsFromBytes(data, url); err != nil {
				return fmt.Errorf("load models failed: %w", err)
			}
			return nil
		}()

		if err != nil {
			log.Debugf("models fetch from %s failed: %v", url, err)
			continue
		}

		log.Infof("models updated from %s", url)
		return
	}
	log.Warn("models refresh failed from all URLs, using current data")
}
```

```go
switch strings.ToLower(codexPlanType) {
case "pro":
	models = registry.GetCodexProModels()
case "plus":
```


Defaulting missing or unknown plan_type to pro looks dangerous here. This makes any Codex auth with missing / unparsable tier metadata advertise pro-only models such as gpt-5.4 even though the account may actually be team / free. Because the tier is reconstructed from auth metadata on reload, this is not just a theoretical case for older or hand-authored files. I would strongly prefer a safe fallback here (lowest/common tier, or no tier-only expansion until the tier is known) plus a regression test for the missing-tier path.
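A conservative mapping along the lines this comment asks for might look like the sketch below. The tier names are illustrative, not the registry's actual identifiers:

```go
package main

import "strings"

// codexTierForPlan maps a plan_type attribute to a catalog tier. Unknown or
// missing plans fall back to the lowest tier instead of pro, so an account is
// never advertised models it may not be entitled to use.
func codexTierForPlan(planType string) string {
	switch strings.ToLower(strings.TrimSpace(planType)) {
	case "pro":
		return "codex-pro"
	case "plus":
		return "codex-plus"
	case "team":
		return "codex-team"
	case "free":
		return "codex-free"
	default:
		// Safe default: lowest common tier until the plan is known.
		return "codex-free"
	}
}
```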

```go
// It blocks until the startup fetch attempt finishes so service initialization
// can wait for the refreshed catalog before registering auth-backed models.
// Safe to call multiple times; only one refresh will run.
func StartModelsUpdater(ctx context.Context) {
```


Blocking startup on a network fetch here is a real operational regression. In the worst case this now waits up to roughly 60s across the two URLs before the server is fully up, and this client does not use the proxy settings that the rest of CLIProxyAPI honors. In environments that require an outbound proxy, startup will always pay that penalty and still fall back to the embedded catalog. I think this should either run asynchronously after startup, or use the existing config-aware HTTP transport so it behaves like the rest of the system.

```yaml
steps:
  - name: Checkout
    uses: actions/checkout@v4
  - name: Refresh models catalog
```


Refreshing models.json from the network during CI means the PR is no longer tested against the catalog content actually committed in this branch. The same commit can produce different results on different days depending on the external file state. That makes CI and releases non-reproducible. I would rather pin the catalog to a reviewed file in the repo (or at least a specific content SHA) and update it through a separate refresh workflow / PR.

```diff
@@ -35,7 +160,7 @@ func GetStaticModelDefinitionsByChannel(channel string) []*ModelInfo {
 	case "aistudio":
 		return GetAIStudioModels()
 	case "codex":
```


The catalog is now split into codex-free/team/plus/pro, but the static lookup path still treats codex as pro only. That means the management static definitions endpoint and LookupStaticModelInfo() will drift from the new tiered catalog design as soon as there is a model that exists only in a non-pro tier or differs across tiers. I think the static lookup path should either merge all Codex tiers for lookup purposes or be made explicitly tier-aware too.
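A tier-merged lookup, sketched with a minimal stand-in for the registry's ModelInfo type, would let the static path find models defined in any Codex tier:

```go
package main

// modelInfo is a minimal stand-in for the registry's ModelInfo type.
type modelInfo struct {
	ID string
}

// mergeCodexTiers concatenates all Codex tier lists, deduplicating by model
// ID, so a static lookup can resolve models that exist only in one tier.
func mergeCodexTiers(tiers ...[]*modelInfo) []*modelInfo {
	seen := make(map[string]bool)
	var out []*modelInfo
	for _, tier := range tiers {
		for _, m := range tier {
			if m == nil || seen[m.ID] {
				continue
			}
			seen[m.ID] = true
			out = append(out, m)
		}
	}
	return out
}
```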

@excelwang left a comment


Thanks for tackling the catalog refresh and Codex tier split. I think the direction makes sense, but I have a few blocking concerns before this is safe to merge:

  1. Codex auths with missing / unknown plan_type currently fall back to pro, which can advertise models that the account is not actually entitled to use.
  2. Startup is now blocked on an outbound network fetch with a plain http.Client, so proxy-only or slow-network environments can pay a large startup penalty before falling back to the embedded catalog.
  3. The workflow changes make CI / release builds non-reproducible because the catalog content is mutated from the network at build time instead of being tied to the commit under test.
  4. The static Codex lookup path is still effectively pro-only even though the catalog is now tiered, so the design is not fully closed yet.

I left line comments with the concrete details and suggested directions.

@luispater merged commit cf74ed2 into dev on Mar 10, 2026
2 checks passed
@luispater deleted the model branch on March 10, 2026 at 11:07