fix: ensure vision-capable models for image attachments by ngoiyaeric · Pull Request #388 · QueueLab/QCX

ngoiyaeric · 2025-12-30T08:22:50Z

User description

Problem

When users attach images to their messages, the AI returns no tokens, so no response streams back. The system appears to process the image but never generates a proper response.

Root Cause

The getModel() function was returning models that don't support vision/multimodal inputs:

xAI (grok-4-fast-non-reasoning): Does not support vision
AWS Bedrock: Had an empty model ID, causing failures
No detection of image content to select appropriate models

Solution

Enhanced getModel(): Added requireVision parameter to ensure vision-capable models are used when images are present
Updated researcher agent: Detects images in messages and requests vision-capable models
Updated resolution-search agent: Uses vision models for image analysis
Fixed Bedrock config: Set valid Claude 3.5 Sonnet model ID as default
Improved type safety: Properly typed multimodal content
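
The gating described in this list can be sketched as a pure function. This is a minimal illustration, not the code in lib/utils/index.ts: `pickProvider`, `ProviderEnv`, and `ProviderName` are hypothetical names, the provider order (xAI, then Bedrock, then OpenAI) is assumed from the PR description, and the real getModel() returns provider model instances rather than strings.

```typescript
// Hypothetical stand-in for the environment variables getModel() reads.
interface ProviderEnv {
  xaiApiKey?: string
  awsAccessKeyId?: string
  awsSecretAccessKey?: string
}

type ProviderName = 'xai' | 'bedrock' | 'openai'

// Returns the name of the provider that would be selected.
function pickProvider(env: ProviderEnv, requireVision: boolean = false): ProviderName {
  // xAI (grok-4-fast-non-reasoning) is only considered when no image is present.
  if (!requireVision && env.xaiApiKey) {
    return 'xai'
  }
  // Bedrock is assumed vision-capable here (Claude 3.5 Sonnet default).
  if (env.awsAccessKeyId && env.awsSecretAccessKey) {
    return 'bedrock'
  }
  // gpt-4o is the final fallback in this sketch.
  return 'openai'
}
```

With only an xAI key configured, a text-only request would resolve to `'xai'` and an image request would fall through to `'openai'`.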

Changes

lib/utils/index.ts: Added requireVision parameter to getModel()
lib/agents/researcher.tsx: Added image detection logic
lib/agents/resolution-search.tsx: Added image detection logic
app/actions.tsx: Improved content type safety

Testing

The fix ensures that when images are attached:

Content is properly formatted as multimodal array
Image detection triggers use of vision-capable models (gpt-4o or Claude 3.5 Sonnet)
Tokens stream back to the user successfully

Fixes the issue described in the attached documentation.

PR Type

Bug fix, Enhancement

Description

Add requireVision parameter to getModel() for vision-capable model selection
Fix AWS Bedrock configuration with valid Claude 3.5 Sonnet model ID
Detect images in researcher and resolution-search agents
Improve type safety for multimodal content handling

Diagram Walkthrough

flowchart LR
  A["Image Attachment"] --> B["Detect Image Content"]
  B --> C["Set requireVision Flag"]
  C --> D["getModel with Vision Support"]
  D --> E["Vision-Capable Model Selected"]
  E --> F["Proper Token Streaming"]

File Walkthrough

Relevant files

Bug fix

index.ts: `Add vision parameter and fix Bedrock model config` (lib/utils/index.ts, +7/-7)
- Added `requireVision` parameter to `getModel()` function to conditionally select vision-capable models
- Fixed AWS Bedrock configuration by setting default Claude 3.5 Sonnet model ID
- Skip xAI grok model when vision is required since it doesn't support multimodal inputs
- Updated comments to clarify vision support for each model provider

Enhancement

actions.tsx: `Improve type safety for multimodal content` (app/actions.tsx, +4/-3)
- Improved type safety for multimodal content by properly typing `content` as `CoreMessage['content']`
- Added explicit type casting for message objects to `CoreMessage`
- Detect image presence in message parts to determine content structure

researcher.tsx: `Add image detection for vision model selection` (lib/agents/researcher.tsx, +7/-1)
- Added image detection logic to check if any message contains image content
- Pass `hasImage` flag to `getModel()` to request vision-capable models when needed
- Enables proper handling of image attachments in research agent

resolution-search.tsx: `Add image detection for resolution search agent` (lib/agents/resolution-search.tsx, +7/-1)
- Added image detection logic to identify image content in messages
- Pass `hasImage` flag to `getModel()` for vision-capable model selection
- Ensures resolution search agent uses appropriate models for image analysis

Summary by CodeRabbit

New Features
- Automatic image detection to pick vision-capable AI models when conversations include images.
Refactor
- Improved type safety for message content and message handling.
- Model selection logic enhanced to consider whether vision is required and to use appropriate provider fallbacks.

- Add requireVision parameter to getModel() to select vision-capable models
- Update researcher agent to detect images and use vision models
- Update resolution-search agent to use vision models for image analysis
- Fix AWS Bedrock configuration with valid Claude 3.5 Sonnet model ID
- Improve type safety for multimodal content in actions.tsx

Fixes the issue where image attachments were not returning tokens because non-vision models were being used for multimodal content processing.

vercel · 2025-12-30T08:22:55Z

Project	Deployment	Review	Updated (UTC)
qcx	Ready	Preview, Comment	Dec 30, 2025 8:37am

coderabbitai · 2025-12-30T08:23:01Z

Walkthrough

Introduces image-detection and vision-aware model selection across agents and utilities, adds explicit CoreMessage['content'] typing when submitting messages, and changes getModel() to accept a requireVision boolean that influences provider selection and xAI usage.

Changes

Cohort / File(s)	Summary
Model selection util `lib/utils/index.ts`	`getModel()` signature changed to `getModel(requireVision: boolean = false)`; provider selection now conditions xAI usage on `requireVision`; `bedrockModelId` reads from `BEDROCK_MODEL_ID` with a default.
Agent image detection `lib/agents/researcher.tsx`, `lib/agents/resolution-search.tsx`	Scan message content parts for type `'image'`; compute `hasImage` and pass it to `getModel(hasImage)`; use selected model for `streamText`/`generateObject`.
Message typing / actions `app/actions.tsx`	Explicitly type `content` as `CoreMessage['content']` in the hasImage branch and cast the pushed message as `CoreMessage` when adding to `messages`.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Action as Action Handler
    participant Agent as Agent (Researcher / ResolutionSearch)
    participant ModelSelection as getModel(requireVision)
    participant ModelAPI as Model Provider

    Client->>Action: Submit message (may include images)
    Action->>Action: Build message object\ncontent typed as CoreMessage['content']
    Action->>Agent: Invoke agent with messages
    Agent->>Agent: Inspect message.content parts\nset hasImage flag
    Agent->>ModelSelection: Call getModel(hasImage)
    alt hasImage = true
        Note right of ModelSelection `#BFD7EA`: Vision required\nxAI may be considered/skipped based on flag
        ModelSelection->>ModelAPI: Return vision-capable model
    else hasImage = false
        Note right of ModelSelection `#E8F6E8`: Vision not required\nstandard provider selection
        ModelSelection->>ModelAPI: Return standard model
    end
    Agent->>ModelAPI: streamText / generateObject using selected model
    ModelAPI->>Agent: Response / stream
    Agent->>Client: Return streamed/generated output

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat: Enable file attachments in chat #303: Modifies app/actions.tsx message content handling and introduces image-aware model selection logic overlapping with these changes.

Suggested labels

Review effort 4/5

Poem

🐰 I hopped through code with eager paws,
Images found — I sounded the cause.
Models picked with sight in mind,
Typed messages neat and kind.
Hooray — a smarter, vision-friendly cause! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: adding vision model selection logic for image attachments by introducing a requireVision parameter and image detection across agents.

qodo-code-review · 2025-12-30T08:23:18Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢	No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: Passed
	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Passed
	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: Passed
🔴	Generic: Robust Error Handling and Edge Case Management
	Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation
	Status: Missing config validation: The Bedrock initialization path does not validate `awsRegion` and can attempt to create a Bedrock client with an undefined region, leading to runtime failures without clear, contextual handling.
	Referred Code:

    const xaiApiKey = process.env.XAI_API_KEY
    const awsAccessKeyId = process.env.AWS_ACCESS_KEY_ID
    const awsSecretAccessKey = process.env.AWS_SECRET_ACCESS_KEY
    const awsRegion = process.env.AWS_REGION
    const bedrockModelId = process.env.BEDROCK_MODEL_ID || 'anthropic.claude-3-5-sonnet-20241022-v2:0'

    // If vision is required, skip models that don't support it
    if (!requireVision && xaiApiKey) {
      const xai = createXai({
        apiKey: xaiApiKey,
        baseURL: 'https://api.x.ai/v1',
      })
      // Optionally, add a check for credit status or skip xAI if credits are exhausted
      try {
        return xai('grok-4-fast-non-reasoning')
      } catch (error) {
        console.warn('xAI API unavailable, falling back to OpenAI:')
      }
    }

    // AWS Bedrock - Claude models support vision
    ... (clipped 5 lines)

	Generic: Security-First Input Validation and Data Handling
	Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities
	Status: Weak env validation: The new Bedrock selection logic uses a default `bedrockModelId` and only checks for access keys, but does not validate required configuration like `AWS_REGION`, which can cause insecure/undefined behavior when selecting external providers.
	Referred Code: the same excerpt as for the preceding check.

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review · 2025-12-30T08:24:17Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Always request a vision-capable model	Medium
To ensure the `resolutionSearch` agent always uses a vision-capable model, modify the `getModel` call to always pass `true` instead of the `hasImage` variable.
lib/agents/resolution-search.tsx [42-54]

   // Check if any message contains an image (resolution search is specifically for image analysis)
   const hasImage = messages.some(message =>
     Array.isArray(message.content) &&
     message.content.some(part => part.type === 'image')
   )

   // Use generateObject to get the full object at once.
   const { object } = await generateObject({
-    model: getModel(hasImage),
+    model: getModel(true),
     system: systemPrompt,
     messages: filteredMessages,
     schema: resolutionSearchSchema,
   })

Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies that the `resolutionSearch` agent's purpose is image analysis, so it should always request a vision-capable model, making the agent more robust.

Possible issue	Ensure Bedrock model supports vision	Low
Add a condition in `getModel` to verify that the `bedrockModelId` corresponds to a vision-capable model (e.g., 'claude-3') when `requireVision` is true, falling back to another provider if it does not.
lib/utils/index.ts [40-55]

-// AWS Bedrock - Claude models support vision
+// AWS Bedrock - Claude 3 models support vision
 if (awsAccessKeyId && awsSecretAccessKey && bedrockModelId) {
-  const bedrock = createAmazonBedrock({
-    bedrockOptions: {
-      region: awsRegion,
-      credentials: {
-        accessKeyId: awsAccessKeyId,
-        secretAccessKey: awsSecretAccessKey,
+  // Only use Bedrock for vision if a Claude 3 model is specified.
+  if (requireVision && !bedrockModelId.includes('claude-3')) {
+    // Do not use Bedrock for vision if the model is not a known vision model,
+    // fall through to the next provider (OpenAI).
+  } else {
+    const bedrock = createAmazonBedrock({
+      bedrockOptions: {
+        region: awsRegion,
+        credentials: {
+          accessKeyId: awsAccessKeyId,
+          secretAccessKey: awsSecretAccessKey,
+        },
       },
-    },
-  })
-  const model = bedrock(bedrockModelId, {
-    additionalModelRequestFields: { top_k: 350 },
-  })
-  return model
+    })
+    const model = bedrock(bedrockModelId, {
+      additionalModelRequestFields: { top_k: 350 },
+    })
+    return model
+  }
 }

Suggestion importance[1-10]: 6
Why: This suggestion improves the robustness of model selection by adding a check to ensure the configured Bedrock model supports vision when required, preventing potential runtime errors.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8977bad and f89ae98.

📒 Files selected for processing (4)

app/actions.tsx
lib/agents/researcher.tsx
lib/agents/resolution-search.tsx
lib/utils/index.ts

🧰 Additional context used

🧬 Code graph analysis (3)

lib/agents/resolution-search.tsx (2)

lib/db/schema.ts (1)

messages (26-37)

lib/utils/index.ts (1)

getModel (19-62)

app/actions.tsx (3)

components/chat-panel.tsx (1)

e (72-107)

lib/actions/chat.ts (1)

msg (119-127)

lib/actions/chat-db.ts (1)

msg (117-121)

lib/agents/researcher.tsx (2)

lib/db/schema.ts (1)

messages (26-37)

lib/utils/index.ts (1)

getModel (19-62)

🔇 Additional comments (5)

lib/agents/researcher.tsx (2)

99-103: Image detection logic looks correct.

The implementation properly checks for multimodal content by inspecting message.content arrays for image parts. This aligns with the PR objective to enable vision-capable model selection when images are present.

106-106: Type assertion is necessary due to missing return type annotation on getModel().

The getModel() function in lib/utils/index.ts lacks an explicit return type annotation and returns instances from different provider factories (createXai, createAmazonBedrock, createOpenAI). Without the as LanguageModel cast, TypeScript cannot guarantee that these provider instances are recognized as LanguageModel. To ensure type safety across all callers, the assertion should be retained. Alternatively, add an explicit return type annotation to getModel() to eliminate the need for repeated casts at every call site.

Likely an incorrect or invalid review comment.

app/actions.tsx (1)

253-282: Type safety improvements look good.

The explicit typing of content as CoreMessage['content'] (line 254) and the message cast (line 282) properly align the multimodal content handling with the expected CoreMessage structure. The conditional logic correctly handles both string content (text-only) and array content (with images).

lib/utils/index.ts (2)

19-38: requireVision parameter correctly gates model selection.

The implementation properly skips xAI's non-vision model when requireVision is true, ensuring that only vision-capable models (Bedrock Claude or OpenAI gpt-4o) are selected for image processing tasks. This directly addresses the root cause described in the PR objectives.

24-24: The default Bedrock model ID is valid and supported.

The model anthropic.claude-3-5-sonnet-20241022-v2:0 is a current Claude 3.5 Sonnet version supported by AWS Bedrock. No changes needed. Regional availability and IAM permissions depend on your AWS configuration—verify these at runtime using your AWS credentials if the model fails to load.

coderabbitai · 2025-12-30T08:26:40Z

+  // Check if any message contains an image (resolution search is specifically for image analysis)
+  const hasImage = messages.some(message => 
+    Array.isArray(message.content) && 
+    message.content.some(part => part.type === 'image')
+  )
+
  // Use generateObject to get the full object at once.
  const { object } = await generateObject({
-    model: getModel(),
+    model: getModel(hasImage),


🧹 Nitpick | 🔵 Trivial

Consider validating that images are actually present.

Since resolutionSearch is specifically designed for satellite image analysis (per the system prompt), consider adding a validation check to ensure images are present before proceeding. This would provide clearer error messages if called incorrectly.

🔎 Suggested validation

   const hasImage = messages.some(message =>
     Array.isArray(message.content) &&
     message.content.some(part => part.type === 'image')
   )
+
+  if (!hasImage) {
+    throw new Error('resolutionSearch requires at least one image in the messages')
+  }

   const { object } = await generateObject({

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

   // Check if any message contains an image (resolution search is specifically for image analysis)
   const hasImage = messages.some(message =>
     Array.isArray(message.content) &&
     message.content.some(part => part.type === 'image')
   )

+  if (!hasImage) {
+    throw new Error('resolutionSearch requires at least one image in the messages')
+  }
+
   // Use generateObject to get the full object at once.
   const { object } = await generateObject({
     model: getModel(hasImage),

🤖 Prompt for AI Agents

In lib/agents/resolution-search.tsx around lines 42 to 50, the code only detects image parts but does not halt when none exist; add an explicit validation after computing hasImage that checks if hasImage is false and then return or throw a clear, contextual error (e.g., "resolutionSearch requires at least one image in messages") so the function short-circuits before calling generateObject/getModel; ensure the validation uses the same Array.isArray(message.content) && message.content.some(...) check and that the error path provides a helpful message for callers and logs where appropriate.

coderabbitai · 2025-12-30T08:26:40Z

+  // Check if any message contains an image (resolution search is specifically for image analysis)
+  const hasImage = messages.some(message => 
+    Array.isArray(message.content) && 
+    message.content.some(part => part.type === 'image')
+  )


🧹 Nitpick | 🔵 Trivial

Consider extracting duplicate image detection logic.

This exact image detection pattern appears in both lib/agents/researcher.tsx (lines 99-103) and here. Consider extracting it to a shared utility function to maintain consistency and reduce duplication.

🔎 Example helper function

Add to lib/utils/index.ts:

+export function hasImageContent(messages: CoreMessage[]): boolean {
+  return messages.some(message =>
+    Array.isArray(message.content) &&
+    message.content.some(part => part.type === 'image')
+  )
+}

Then use it in both files:

-  const hasImage = messages.some(message =>
-    Array.isArray(message.content) &&
-    message.content.some(part => part.type === 'image')
-  )
+  const hasImage = hasImageContent(messages)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In lib/agents/resolution-search.tsx around lines 42 to 46 the image-detection logic (checking message.content is an array and any part.type === 'image') is duplicated elsewhere (lib/agents/researcher.tsx lines 99-103); extract this into a shared utility (e.g., export function hasImage(messages: Message[]): boolean) placed in lib/utils/index.ts, implement the same array-and-part-type checks once in that function, export it, then replace the inline detection in both files with an import of the new hasImage utility and call it (adjust imports/typing accordingly).

charliecreates

The core direction is correct, but getModel(requireVision) does not actually guarantee a vision-capable model when Bedrock is configured with an arbitrary BEDROCK_MODEL_ID, which can reintroduce the original failure mode. app/actions.tsx still has a lossy conversion path (joining parts to string) that may break if non-text parts exist, and it uses a broad as CoreMessage cast that can mask structural issues. Image detection logic is duplicated across agents and is inconsistent in resolution-search (computed from a different message set than the one sent to the model). Overall reliability would improve with a centralized messagesHaveImage() helper and stricter runtime selection/validation for vision models.

Additional notes (4)

Maintainability | app/actions.tsx:279-282
Casting the entire object to CoreMessage (} as CoreMessage) hides structural issues and makes it easy to accidentally omit required fields (or include incompatible ones) without noticing. Since you already have a CoreMessage['content'] variable, it’s better to construct an object that is explicitly a CoreMessage (or ensure the array messages is typed so the push is checked) rather than forcing a cast at the end.
Maintainability | lib/agents/researcher.tsx:99-103
The hasImage detection is duplicated across agents and is subtly coupled to the exact content representation (Array.isArray(message.content) + part.type === 'image'). If the message format evolves (different part types, or nested content), you’ll need to update multiple locations and risk inconsistent behavior.

Given this PR’s theme (vision gating), centralizing image detection will reduce drift and make it easier to test.

Compatibility | lib/utils/index.ts:19-19
getModel(requireVision) currently assumes Bedrock is vision-capable whenever credentials exist and a model id is set. However, BEDROCK_MODEL_ID is configurable and could be set to a non-vision model, which would reintroduce the original failure mode when requireVision=true.

Since this function is now responsible for guaranteeing vision support, it should enforce that guarantee (or have an explicit allowlist of known vision-capable IDs) when requireVision is true.
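
An allowlist check along these lines is one way to enforce that guarantee. It is a sketch under assumptions: `VISION_CAPABLE_BEDROCK_MODELS` and `bedrockModelAllowed` are hypothetical names, and the only model ID listed is the default taken from this PR; a real allowlist would need to be maintained against the Bedrock model catalog.

```typescript
// Hypothetical allowlist. Only the PR's default model ID is included here.
const VISION_CAPABLE_BEDROCK_MODELS: ReadonlySet<string> = new Set([
  'anthropic.claude-3-5-sonnet-20241022-v2:0',
])

// Returns true when the configured Bedrock model may be used for this request.
// Text-only requests accept any model; image requests require an allowlisted ID.
function bedrockModelAllowed(modelId: string, requireVision: boolean): boolean {
  return !requireVision || VISION_CAPABLE_BEDROCK_MODELS.has(modelId)
}
```

When the check returns false, the caller would skip Bedrock and continue to the next provider instead of returning a model that cannot read images.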

Readability | lib/utils/index.ts:40-40
When requireVision is true, the function skips xAI (good), but the selection order is still Bedrock-first, then OpenAI. That’s fine, but there’s no explicit check that the Bedrock region is set (awsRegion). If AWS_REGION is missing, this will likely fail at runtime and prevent fallback.

Given the goal is reliability for image inputs, consider requiring region (or catching Bedrock creation errors and falling back).
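
The "catch and fall back" option could look roughly like the following. All names (`ModelFactory`, `firstAvailable`, `bedrockFactory`) are hypothetical, and strings stand in for real model instances; the point is only that a missing region raises inside a factory and the next provider is tried.

```typescript
// A factory either returns a model handle or throws if it cannot be created.
type ModelFactory<T> = () => T

// Tries each factory in order and returns the first model that is created.
function firstAvailable<T>(factories: ModelFactory<T>[]): T {
  let failures = 0
  for (const create of factories) {
    try {
      return create()
    } catch (error) {
      // Record the failure and move on to the next provider.
      failures += 1
    }
  }
  throw new Error(`No provider could be initialized (${failures} failed)`)
}

// Hypothetical Bedrock factory that rejects a missing region up front.
function bedrockFactory(region: string | undefined): ModelFactory<string> {
  return () => {
    if (!region) {
      throw new Error('AWS_REGION is not set')
    }
    return `bedrock:${region}`
  }
}
```

Calling `firstAvailable([bedrockFactory(process.env.AWS_REGION), openAiFactory])` with some `openAiFactory` would then degrade to OpenAI when the region is unset, rather than failing at request time.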

Summary of changes

What changed

Added vision-aware model selection by extending getModel() to accept a requireVision: boolean parameter and using it to avoid xAI non-vision models when images are present.
Fixed AWS Bedrock default model id by reading process.env.BEDROCK_MODEL_ID with a fallback to anthropic.claude-3-5-sonnet-20241022-v2:0.
Added image detection in both agents:
- lib/agents/researcher.tsx detects multimodal image parts and calls getModel(hasImage).
- lib/agents/resolution-search.tsx does the same for image analysis and calls getModel(hasImage).
Improved message content typing in app/actions.tsx by using CoreMessage['content'] instead of any, and casting the pushed message to CoreMessage.

Files touched

app/actions.tsx
lib/agents/researcher.tsx
lib/agents/resolution-search.tsx
lib/utils/index.ts

charliecreates · 2025-12-30T08:34:45Z

  const hasImage = messageParts.some(part => part.type === 'image')
-  const content = hasImage
-    ? (messageParts as any)
+  // Properly type the content based on whether it contains images
+  const content: CoreMessage['content'] = hasImage
+    ? messageParts as CoreMessage['content']
    : messageParts.map(part => part.text).join('\n')


content is typed as CoreMessage['content'], but the code only checks hasImage to decide whether to send an array vs. a joined string. This can silently produce an invalid payload when non-image non-text parts exist (e.g., future file, audio, tool parts) or if a text part is missing .text. In that case, the non-image branch will do part.text and potentially drop/flatten data.

Given the stated goal (multimodal correctness), it’s safer to build content by validating/normalizing parts and only joining when you’re sure all parts are text.

Suggestion

Tighten the branching to ensure the string path only runs when every part is text, otherwise keep the array form:

const isAllText = messageParts.every(p => p.type === 'text')
const content: CoreMessage['content'] = isAllText
  ? messageParts.map(p => p.text).join('\n')
  : (messageParts as CoreMessage['content'])

This avoids lossy conversion if additional part types are introduced later. Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this change.
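
A self-contained variant of the same idea, written so the type checker narrows each part without casts. `MessagePart` and `buildContent` are hypothetical names, and the part shapes are simplified stand-ins for the AI SDK's content part types.

```typescript
// Simplified stand-ins for the SDK's text and image parts.
type MessagePart =
  | { type: 'text'; text: string }
  | { type: 'image'; image: string }

// Joins parts into a string only when every part is text.
// Any non-text part keeps the whole message in array form.
function buildContent(parts: MessagePart[]): string | MessagePart[] {
  const texts: string[] = []
  for (const part of parts) {
    if (part.type !== 'text') {
      return parts
    }
    texts.push(part.text)
  }
  return texts.join('\n')
}
```

Because the loop returns the array as soon as it meets a non-text part, part types added later would default to the lossless path.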

charliecreates · 2025-12-30T08:34:45Z

  const filteredMessages = messages.filter(msg => msg.role !== 'system');

+  // Check if any message contains an image (resolution search is specifically for image analysis)
+  const hasImage = messages.some(message => 
+    Array.isArray(message.content) && 
+    message.content.some(part => part.type === 'image')
+  )
+
  // Use generateObject to get the full object at once.
  const { object } = await generateObject({
-    model: getModel(),
+    model: getModel(hasImage),


filteredMessages is computed, but hasImage is checked against messages (unfiltered). If a system message ever contains image parts (or the caller passes different arrays), you could incorrectly select a vision model when it’s not needed, or vice versa.

For correctness and consistency, compute hasImage from the actual messages array you pass to the model (filteredMessages).

Suggestion

Base the check on filteredMessages:

const hasImage = filteredMessages.some(message =>
  Array.isArray(message.content) &&
  message.content.some(part => part.type === 'image')
)

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this fix.
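
To make the difference concrete, here is a small sketch that filters first and detects second. `ChatPart`, `ChatMessage`, and `prepareForModel` are hypothetical names, and the message type deliberately allows array content on system messages purely to illustrate the edge case raised above.

```typescript
type ChatPart =
  | { type: 'text'; text: string }
  | { type: 'image'; image: string }

interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string | ChatPart[]
}

// Drops system messages, then checks the remaining messages for image parts,
// so the vision flag reflects exactly what is sent to the model.
function prepareForModel(messages: ChatMessage[]): {
  filteredMessages: ChatMessage[]
  hasImage: boolean
} {
  const filteredMessages = messages.filter(msg => msg.role !== 'system')
  const hasImage = filteredMessages.some(
    message =>
      Array.isArray(message.content) &&
      message.content.some(part => part.type === 'image')
  )
  return { filteredMessages, hasImage }
}
```

An image that appears only in a filtered-out system message no longer sets the flag, while an image in a user message still does.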

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

lib/utils/index.ts (2)

37-37: Fix misleading fallback message.

The error message claims fallback to OpenAI, but the actual fallback order is Gemini → Bedrock → OpenAI.

🔎 Suggested fix

-      console.warn('xAI API unavailable, falling back to OpenAI:')
+      console.warn('xAI API unavailable, falling back to next provider')

55-69: Consider adding error handling for consistency.

Unlike xAI and Gemini providers, the Bedrock block lacks a try-catch. If Bedrock initialization fails, the function won't fall back to OpenAI. Adding error handling would improve reliability and consistency, especially for vision-required scenarios.

🔎 Suggested improvement

   if (awsAccessKeyId && awsSecretAccessKey) {
-    const bedrock = createAmazonBedrock({
-      bedrockOptions: {
-        region: awsRegion,
-        credentials: {
-          accessKeyId: awsAccessKeyId,
-          secretAccessKey: awsSecretAccessKey,
-        },
-      },
-    })
-    const model = bedrock(bedrockModelId, {
-      additionalModelRequestFields: { top_k: 350 },
-    })
-    return model
+    try {
+      const bedrock = createAmazonBedrock({
+        bedrockOptions: {
+          region: awsRegion,
+          credentials: {
+            accessKeyId: awsAccessKeyId,
+            secretAccessKey: awsSecretAccessKey,
+          },
+        },
+      })
+      const model = bedrock(bedrockModelId, {
+        additionalModelRequestFields: { top_k: 350 },
+      })
+      return model
+    } catch (error) {
+      console.warn('AWS Bedrock unavailable, falling back to next provider:', error)
+    }
   }

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f89ae98 and 8ec42ae.

📒 Files selected for processing (1)

lib/utils/index.ts

🔇 Additional comments (4)

lib/utils/index.ts (4)

19-19: LGTM! Clean function signature with clear intent.

The requireVision parameter with a sensible default provides an explicit, backward-compatible API for vision-aware model selection.

71-76: LGTM! Clear documentation of vision support.

The updated comment accurately reflects that gpt-4o supports vision, making it a suitable fallback for multimodal inputs.

41-51: Gemini 3 Pro vision support is confirmed.

Gemini 3 Pro is natively multimodal and supports image inputs (up to 900 images per prompt, with support for PNG, JPEG, WebP, HEIC, and HEIF formats). The code correctly does not skip Gemini when requireVision=true.

25-25: No issues found. The model ID 'anthropic.claude-3-5-sonnet-20241022-v2:0' is valid and documented in AWS Bedrock, and Claude 3.5 Sonnet supports vision/multimodal capabilities as required by the requireVision parameter in the getModel() function.

coderabbitai · 2025-12-30T08:39:54Z

+  // If vision is required, skip models that don't support it
+  if (!requireVision && xaiApiKey) {


🧹 Nitpick | 🔵 Trivial

Good vision-gating logic.

Correctly skips xAI when vision is required, as xAI models don't support multimodal inputs.

Optional: Consider clarifying the comment

-  // If vision is required, skip models that don't support it
+  // If vision is required, skip xAI (no vision support) and try vision-capable providers

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- // If vision is required, skip models that don't support it
+ // If vision is required, skip xAI (no vision support) and try vision-capable providers
  if (!requireVision && xaiApiKey) {

🤖 Prompt for AI Agents

In lib/utils/index.ts around lines 27 to 28, the existing comment "If vision is required, skip models that don't support it" is misleading given the conditional; update the comment to accurately describe the logic (e.g., explain that when vision is not required and an xAI API key is present, xAI-only models are considered, otherwise xAI is skipped) so future readers understand the condition and intent.
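The condition discussed above can be read as a gate in a provider fallback chain. The following is a minimal sketch of that idea, not the code in `lib/utils/index.ts`: the `Provider`, `ProviderKeys`, and `pickProvider` names are hypothetical, and the order of providers is an assumption made for illustration.

```typescript
// Hypothetical names; the provider order is assumed, not taken from the PR.
type Provider = "xai" | "gemini" | "bedrock" | "openai";

interface ProviderKeys {
  xai?: string;
  gemini?: string;
  bedrock?: string;
}

function pickProvider(requireVision: boolean, keys: ProviderKeys): Provider {
  // xAI is considered only when vision is NOT required and a key is present.
  if (!requireVision && keys.xai) return "xai";
  // Gemini is not gated, since the review confirms it accepts image inputs.
  if (keys.gemini) return "gemini";
  // Bedrock (Claude 3.5 Sonnet) is also treated as vision-capable.
  if (keys.bedrock) return "bedrock";
  // gpt-4o supports vision, so it serves as the final fallback.
  return "openai";
}
```

Read this way, `!requireVision && xaiApiKey` means "use xAI only for text-only requests"; when `requireVision` is true the first branch is never taken, so selection falls through to the vision-capable providers.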

This commit documents that the branch has been synchronized with the latest changes from main branch. All recent updates have been merged including:

- Gemini 3 Pro model support (PR #389)
- Image attachment token fixes (PR #388)
- Comprehensive E2E test suite (PR #350)
- Playwright GitHub Actions CI/CD
- All dependency updates and bug fixes

The Supabase backend implementation and collaboration features from this PR have been preserved and are compatible with the latest main branch changes.

charliecreates Bot requested a review from CharlieHelps December 30, 2025 08:22

qodo-code-review Bot added the Review effort 2/5 label Dec 30, 2025

vercel Bot deployed to Preview – qcx December 30, 2025 08:24 View deployment

coderabbitai Bot reviewed Dec 30, 2025


charliecreates Bot reviewed Dec 30, 2025


charliecreates Bot removed the request for review from CharlieHelps December 30, 2025 08:34

Merge branch 'main' into fix/image-attachment-tokens

8ec42ae

vercel Bot deployed to Preview – qcx December 30, 2025 08:37 View deployment

coderabbitai Bot reviewed Dec 30, 2025


ngoiyaeric merged commit 13697f3 into main Dec 30, 2025
5 checks passed
