feat(tts): add configurable computeUnits for Kokoro models #482

Merged
Alex-Wengg merged 4 commits into FluidInference:main from IntegrIT-Solutions:feat/kokoro-configurable-compute-units on Apr 4, 2026

Conversation

integrITsolutions (Contributor) commented Apr 4, 2026

Summary

Adds a computeUnits parameter (default: .all) to TtsModels.download(), KokoroTtsManager.init(), and KokoroModelCache.init(), allowing callers to override CoreML compute units for Kokoro model loading.

Problem

iOS 26 (beta, Build 23E246) introduces Apple Neural Engine (ANE) compiler regressions that cause Kokoro models to fail with:

Error: Cannot retrieve vector from IRValue format int32
Unable to compute the asynchronous prediction using ML Program

This is a known ecosystem-wide issue affecting CoreML models on iOS 26 (see whisper.cpp#3702, executorch#15833, Apple Developer Forums thread 799456). The root cause is a change in the ANE compiler/runtime that breaks models loaded with computeUnits: .all, which includes the Neural Engine.

Solution

Exposes the computeUnits parameter so callers can use .cpuAndGPU on iOS 26+ to bypass the ANE, matching the approach PocketTTS already uses to avoid ANE float16 precision artifacts.

Backwards compatible: The default remains .all, preserving existing behavior on iOS 17-18.

Changes

  • TtsModels.swift: Added computeUnits parameter to download(), piped to DownloadUtils.loadModels()
  • KokoroTtsManager.swift: Added computeUnits parameter to init(), stored and passed to TtsModels.download() and KokoroModelCache
  • KokoroModelCache.swift: Added computeUnits parameter to init(), piped to TtsModels.download() in loadModelsIfNeeded()

Usage

// iOS 26+ workaround
let manager = KokoroTtsManager(computeUnits: .cpuAndGPU)
try await manager.initialize()

// Existing behavior unchanged (default .all)
let defaultManager = KokoroTtsManager()
try await defaultManager.initialize()
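
Callers that ship to both older and newer OS versions may want to pick the compute units at runtime rather than hard-coding them. The following is a minimal sketch, not part of this PR: `kokoroComputeUnits(forMajorOSVersion:)` is a hypothetical helper, and the cutoff at major version 26 is an assumption taken from the Problem section above. It also assumes the code runs on iOS, where `operatingSystemVersion` reports the iOS version.

```swift
import CoreML
import Foundation

/// Hypothetical helper (not part of this PR): chooses CoreML compute units
/// from the OS major version.
func kokoroComputeUnits(forMajorOSVersion major: Int) -> MLComputeUnits {
    // Assumption: the ANE regression affects major version 26 and later,
    // so skip the Neural Engine there and keep the default elsewhere.
    return major >= 26 ? .cpuAndGPU : .all
}

// Read the running OS major version from Foundation.
let major = ProcessInfo.processInfo.operatingSystemVersion.majorVersion
let units = kokoroComputeUnits(forMajorOSVersion: major)

// The result would then be passed to the initializer added in this PR:
// let manager = KokoroTtsManager(computeUnits: units)
// try await manager.initialize()
```

Keeping the decision in a pure function of the version number makes it easy to unit test and to adjust if a later OS release fixes the regression.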

Testing

  • Verified Kokoro initialization succeeds with .cpuAndGPU on iOS 26.4 beta (iPhone 14 Pro, A16)
  • Default .all behavior unchanged on older iOS versions
  • No API breaking changes


Adds a `computeUnits` parameter (default: `.all`) to `TtsModels.download()`,
`KokoroTtsManager.init()`, and `KokoroModelCache.init()`, allowing callers
to override CoreML compute units for Kokoro model loading.

This is needed because iOS 26 introduces ANE compiler regressions that cause
Kokoro models to fail with "Cannot retrieve vector from IRValue format int32"
when loaded with `.all` (which includes the Neural Engine). Using `.cpuAndGPU`
bypasses the ANE and resolves the issue, matching the approach already used
by PocketTTS to avoid ANE float16 precision artifacts.

The default `.all` preserves existing behavior on iOS 17-18. Callers on
iOS 26+ can pass `.cpuAndGPU` to work around the ANE regression.

Example:
```swift
let manager = KokoroTtsManager(computeUnits: .cpuAndGPU)
try await manager.initialize()
```
devin-ai-integration[bot]

This comment was marked as resolved.

integrITsolutions and others added 3 commits April 4, 2026 18:54
When KokoroTtsManager was initialized with a custom computeUnits but no
directory (the common case), the modelCache default parameter was used
as-is with .all compute units, silently ignoring the caller's setting.
This meant on-demand model loading could still hit the ANE, defeating
the iOS 26 workaround.

Make modelCache optional (nil = not user-provided) so we always create
a cache with the correct computeUnits when the caller doesn't supply
their own.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
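
The optional-parameter fix described in the commit message above can be sketched roughly as follows. This is an illustration only: `SketchModelCache` and `SketchTtsManager` are hypothetical stand-ins, not the actual `KokoroModelCache` and `KokoroTtsManager` implementations, whose real signatures are not shown in this PR page.

```swift
import CoreML

// Hypothetical stand-in for the model cache; not the library's type.
final class SketchModelCache {
    let computeUnits: MLComputeUnits

    init(computeUnits: MLComputeUnits = .all) {
        self.computeUnits = computeUnits
    }
}

// Hypothetical stand-in for the manager; not the library's type.
final class SketchTtsManager {
    let modelCache: SketchModelCache

    // A default of `SketchModelCache()` here would always use `.all` and
    // ignore `computeUnits`. Making the parameter optional lets `nil`
    // mean "the caller did not supply a cache".
    init(computeUnits: MLComputeUnits = .all, modelCache: SketchModelCache? = nil) {
        // Use the caller's cache if given; otherwise build one that
        // carries the caller's compute units.
        self.modelCache = modelCache ?? SketchModelCache(computeUnits: computeUnits)
    }
}
```

With this shape, `SketchTtsManager(computeUnits: .cpuAndGPU)` ends up with a cache that also uses `.cpuAndGPU`, while a caller who passes an explicit cache keeps full control over it.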
@Alex-Wengg Alex-Wengg merged commit 57551cd into FluidInference:main Apr 4, 2026
12 checks passed