Successfully replaced the 123MB Python/MLX Parakeet sidecar with a 1.2MB Swift/FluidAudio implementation that provides native macOS transcription using the Apple Neural Engine.
Files Created:
- `Sources/main.swift` - Main sidecar logic with FluidAudio integration
- `Package.swift` - Swift package configuration
- `build.sh` - Automated build script with proper target triple naming
- `README.md` - Comprehensive documentation
- `.gitignore` - Git ignore rules for build artifacts
Features:
- ✅ JSON-based communication protocol (stdin/stdout)
- ✅ Commands: `load_model`, `transcribe`, `delete_model`, `unload_model`, `status`
- ✅ FluidAudio SDK integration (v0.5.2)
- ✅ Apple Neural Engine acceleration
- ✅ Proper error handling and status responses
- ✅ Model caching managed by FluidAudio
Modified Files:
- `src-tauri/src/parakeet/manager.rs` - Delegates to Swift sidecar
- `src-tauri/src/parakeet/messages.rs` - Added `DeleteModel` command
- `src-tauri/src/commands/reset.rs` - Clears FluidAudio cache
- `src-tauri/src/commands/model.rs` - Unified model management
- `src-tauri/build.rs` - Automatically builds Swift sidecar
- `src-tauri/tauri.conf.json` - Updated externalBin path
Key Improvements:
- ✅ Proper model availability checking via FluidAudio cache
- ✅ Health check function for sidecar verification
- ✅ Download/delete operations delegate to Swift
- ✅ Reset App Data clears all FluidAudio cached files
Automated Build Process:
- `pnpm tauri dev` or `pnpm tauri build`
- → Triggers `src-tauri/build.rs`
- → Runs `sidecar/parakeet-swift/build.sh`
- → Produces `dist/parakeet-sidecar-aarch64-apple-darwin`
- → Tauri bundles it automatically
Target Triple Handling:
- macOS ARM64: `aarch64-apple-darwin`
- macOS Intel: `x86_64-apple-darwin`
- Future: Linux/Windows targets configurable
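
The naming and build steps above can be sketched in plain Rust. This is a minimal illustration and not the project's actual `build.rs`: the helper names are hypothetical, and passing the triple to the script through a `TARGET_TRIPLE` environment variable is an assumption about how `build.sh` might be parameterized. Cargo does set a `TARGET` environment variable for build scripts, which a real `build.rs` could read to obtain the triple.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// File name in the `<base>-<target-triple>` form described above.
fn sidecar_file_name(base: &str, target_triple: &str) -> String {
    format!("{}-{}", base, target_triple)
}

/// Path where the build script is expected to leave the binary.
fn sidecar_output_path(dist_dir: &Path, base: &str, target_triple: &str) -> PathBuf {
    dist_dir.join(sidecar_file_name(base, target_triple))
}

/// Runs the build script and reports success as a bool instead of panicking,
/// so a failed sidecar build does not abort the whole build.
/// `TARGET_TRIPLE` is a hypothetical variable name, not a documented interface.
fn build_sidecar(script: &Path, target_triple: &str) -> bool {
    match Command::new("bash")
        .arg(script)
        .env("TARGET_TRIPLE", target_triple)
        .status()
    {
        Ok(status) => status.success(),
        Err(_) => false, // bash missing or the process could not be spawned
    }
}
```

A build script could call `build_sidecar` and emit a `cargo:warning=...` line when it returns `false`, which is one way to get the graceful failure handling mentioned later in this document.
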
| Metric | Old (Python/MLX) | New (Swift/FluidAudio) | Improvement |
|---|---|---|---|
| Binary Size | 123 MB | 1.2 MB | 99% smaller |
| Download Size | 123 MB + 500 MB models | 1.2 MB + 500 MB models | Same models, tiny binary |
| Performance | MLX (CPU/GPU) | Apple Neural Engine | Native acceleration |
| User Control | Auto-download | User clicks Download | Better UX |
| macOS Integration | Python runtime | Native Swift | Fully native |
Model Download Flow:
1. User clicks "Download" in Settings
2. Frontend → Rust: download_model(model_name)
3. Rust → Swift: {"type": "load_model", "model_id": "..."}
4. Swift → FluidAudio: AsrModels.downloadAndLoad()
5. FluidAudio downloads CoreML to ~/Library/Application Support/
6. Swift → Rust: {"type": "status", "loaded_model": "..."}
7. Rust → Frontend: model-downloaded event
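
To make the message shapes in step 3 concrete, here is a hedged sketch of how the five commands could be encoded as single-line JSON using only the standard library. `SidecarCommand` is a hypothetical stand-in for the project's `ParakeetCommand` type, which more likely relies on a serialization crate. The field names `model_id` and `audio_path` come from the flows in this document; that `unload_model` and `status` carry no extra fields is an assumption.

```rust
/// Hypothetical stand-in for the project's command type.
enum SidecarCommand {
    LoadModel { model_id: String },
    Transcribe { audio_path: String },
    DeleteModel,
    UnloadModel,
    Status,
}

/// Escapes a string so it can be placed inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl SidecarCommand {
    /// Encodes the command as one JSON object without a trailing newline.
    fn to_json_line(&self) -> String {
        match self {
            SidecarCommand::LoadModel { model_id } => format!(
                "{{\"type\":\"load_model\",\"model_id\":\"{}\"}}",
                escape_json(model_id)
            ),
            SidecarCommand::Transcribe { audio_path } => format!(
                "{{\"type\":\"transcribe\",\"audio_path\":\"{}\"}}",
                escape_json(audio_path)
            ),
            SidecarCommand::DeleteModel => "{\"type\":\"delete_model\"}".to_string(),
            SidecarCommand::UnloadModel => "{\"type\":\"unload_model\"}".to_string(),
            SidecarCommand::Status => "{\"type\":\"status\"}".to_string(),
        }
    }
}
```

Keeping every message on one line matters because the protocol is line-delimited: a raw newline inside a path would split the message, which is why the escaping step is included.
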
Transcription Flow:
1. User records audio
2. Frontend → Rust: transcribe(audio_path)
3. Rust → Swift: {"type": "transcribe", "audio_path": "..."}
4. Swift → FluidAudio: asrManager.transcribe(fileURL)
5. FluidAudio uses Apple Neural Engine
6. Swift → Rust: {"type": "transcription", "text": "..."}
7. Rust → Frontend: Insert text at cursor
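
Steps 3 and 6 form a request/response round trip over the sidecar's stdin and stdout. The sketch below shows that pattern with generic `std::io` traits rather than Tauri's sidecar API, so it is an illustration of the line-delimited exchange and not the code in `sidecar.rs`. The function name `round_trip` is hypothetical.

```rust
use std::io::{self, BufRead, Write};

/// Sends one line-delimited JSON request and waits for one response line.
fn round_trip<W: Write, R: BufRead>(
    to_sidecar: &mut W,
    from_sidecar: &mut R,
    request: &str,
) -> io::Result<String> {
    // One message per line, flushed immediately so the peer is not left waiting.
    to_sidecar.write_all(request.as_bytes())?;
    to_sidecar.write_all(b"\n")?;
    to_sidecar.flush()?;

    let mut line = String::new();
    let bytes_read = from_sidecar.read_line(&mut line)?;
    if bytes_read == 0 {
        // Zero bytes means the other side closed its output stream.
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "sidecar closed stdout before replying",
        ));
    }
    Ok(line.trim_end().to_string())
}
```

Because the function only depends on `Write` and `BufRead`, it can be exercised in tests with an in-memory `Vec<u8>` and `std::io::Cursor` instead of a real child process.
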
Model Deletion Flow:
1. User clicks "Remove" in Settings
2. Frontend → Rust: delete_model(model_name)
3. Rust → Swift: {"type": "delete_model"}
4. Swift deletes:
   - ~/Library/Application Support/FluidAudio/
   - ~/Library/Application Support/parakeet-tdt-0.6b-v3-coreml/
   - ~/Library/Caches/FluidAudio/
5. Swift → Rust: {"type": "status", "loaded_model": null}
6. Rust → Frontend: model-deleted event
Reset App Data Flow:
1. User clicks "Reset App Data"
2. Frontend → Rust: reset_app_data()
3. Rust clears:
   - FluidAudio cache directories
   - Old Parakeet tracking dirs
   - Tauri stores (settings, transcriptions)
   - Secure store (API keys)
   - System preferences
4. Rust → Frontend: reset-complete event
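
The cache-clearing part of step 3 could look roughly like the following. This is a sketch under assumptions, not the contents of `reset.rs`: the function names are hypothetical, and the directory list is the one given for the model deletion flow in this document, on the assumption that the reset clears the same locations.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// FluidAudio-related directories named in this document, resolved against
/// the user's home directory.
fn fluid_audio_dirs(home: &Path) -> Vec<PathBuf> {
    vec![
        home.join("Library/Application Support/FluidAudio"),
        home.join("Library/Application Support/parakeet-tdt-0.6b-v3-coreml"),
        home.join("Library/Caches/FluidAudio"),
    ]
}

/// Removes every listed directory that exists and returns how many were removed.
/// Directories that are already missing are skipped, so a reset can be repeated.
fn clear_dirs(dirs: &[PathBuf]) -> io::Result<usize> {
    let mut removed = 0;
    for dir in dirs {
        match fs::remove_dir_all(dir) {
            Ok(()) => removed += 1,
            Err(e) if e.kind() == io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
    }
    Ok(removed)
}
```

Treating "not found" as success keeps the operation idempotent; any other error (for example a permissions failure) is still surfaced to the caller.
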
- Build Test: `pnpm tauri dev` compiles Swift sidecar
- Health Check: App starts without sidecar errors
- Download: Click Download, verify ~500MB CoreML downloads
- Status Check: Downloaded model shows as available
- Transcription: Record audio, verify transcription works
- Quality: Check transcription accuracy vs Whisper
- Delete: Click Remove, verify files deleted from disk
- Re-download: Download again after delete
- Reset App Data: Verify all Parakeet data cleared
- Persistence: Model selection survives app restart
```rust
// Integration test idea
#[tokio::test]
async fn test_parakeet_sidecar_communication() {
    let app = test_app();
    let manager = ParakeetManager::new(temp_dir());

    // Health check
    assert!(manager.health_check(&app).await.is_ok());

    // Status check
    let response = manager.client.send(&app, &ParakeetCommand::Status {}).await.unwrap();
    assert!(matches!(response, ParakeetResponse::Status { .. }));
}
```
- macOS Only: Swift/FluidAudio is macOS-exclusive (by design)
  - Backend: Returns empty Parakeet model list on Windows/Linux
  - Frontend: Dynamically detects engine from selected model
  - Windows/Linux: Only Whisper models appear in UI
  - Future: May add Windows-specific native models if available
- Model Availability Heuristic:
  - Currently checks if FluidAudio cache directories exist
  - Not 100% accurate if user manually deletes files
  - Improvement: Query sidecar status on app startup
- No Progress for Model Download:
  - FluidAudio doesn't expose download progress
  - UI shows indeterminate spinner
  - User must wait ~2-5 minutes for 500MB download
- Single Model Support:
  - Only Parakeet TDT 0.6B v3 currently available
  - FluidAudio may support more models in future
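
The availability heuristic described above checks whether cache directories exist. A slightly stricter variant, sketched below, also requires a directory to contain at least one entry, which catches the case where a user emptied the folder by hand. It is still only a heuristic and does not replace querying the sidecar; the function names are hypothetical and this is not the project's current check.

```rust
use std::fs;
use std::path::Path;

/// True when `dir` exists, is readable, and has at least one readable entry.
fn dir_has_entries(dir: &Path) -> bool {
    match fs::read_dir(dir) {
        Ok(mut entries) => entries.any(|entry| entry.is_ok()),
        Err(_) => false, // missing, not a directory, or not readable
    }
}

/// Heuristic: the model is probably downloaded if any known cache directory
/// is non-empty. It cannot tell whether the files inside are complete.
fn model_probably_available(cache_dirs: &[&Path]) -> bool {
    cache_dirs.iter().any(|dir| dir_has_entries(dir))
}
```
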
- Expose FluidAudio download progress (if SDK adds support)
- Add proper model availability query on startup
- Support multiple Parakeet model variants
- Add offline mode detection (warn if no internet for download)
- Implement model update mechanism
sidecar/parakeet-swift/Sources/main.swift
sidecar/parakeet-swift/Package.swift
sidecar/parakeet-swift/build.sh
sidecar/parakeet-swift/README.md
sidecar/parakeet-swift/.gitignore
PARAKEET_SWIFT_INTEGRATION.md (this file)
src-tauri/build.rs
src-tauri/tauri.conf.json
src-tauri/src/parakeet/manager.rs (macOS-only logic added)
src-tauri/src/parakeet/models.rs (removed V2, macOS-only)
src-tauri/src/parakeet/messages.rs
src-tauri/src/commands/reset.rs
src-tauri/src/commands/model.rs
src/components/onboarding/OnboardingDesktop.tsx (dynamic engine detection)
src-tauri/src/parakeet/sidecar.rs (communication logic)
src-tauri/capabilities/macos.json (sidecar permissions)
src-tauri/capabilities/default.json (sidecar permissions)
- Binary Naming: Must follow `binary-name-$TARGET_TRIPLE` format
  - Example: `parakeet-sidecar-aarch64-apple-darwin`
  - Tauri automatically appends target triple when spawning
- externalBin Path: Points to base name WITHOUT target triple
  - ✅ Correct: `"../sidecar/parakeet-swift/dist/parakeet-sidecar"`
  - ❌ Wrong: `"../sidecar/parakeet-swift/dist/parakeet-sidecar-aarch64-apple-darwin"`
- Build Integration: Use `build.rs` for automated compilation
  - Runs before Tauri build
  - Gracefully handles build failures
  - Supports incremental builds
- Permissions: Configure in `capabilities/*.json`
  - `shell:allow-spawn` for launching sidecar
  - `shell:allow-stdin-write` for sending commands
- Communication: JSON over stdin/stdout is reliable
  - Use line-delimited JSON
  - Always flush stdout after writing
  - Handle stderr for debugging
- Package Management: Swift Package Manager is straightforward
  - Dependencies resolve automatically
  - Release builds are optimized and small
- FluidAudio SDK: v0.5.2 is stable
  - Requires macOS 13.0+
  - Handles model caching automatically
  - Returns simple `ASRResult` struct
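- JSON Serialization: Swift Codable is powerful
  - Use `CodingKeys` enum for snake_case conversion
  - Default property values are not applied when a key is missing from the JSON: the synthesized decoder throws instead of falling back to the default, so write a custom `init(from:)` (or make the property optional) for fields that may be absent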
- Test End-to-End Flow
  - Run `pnpm tauri dev`, then test: Download → Transcribe → Remove → Reset
- Verify Build Process
  - Run `pnpm tauri build` and ensure the sidecar is bundled in the .app
- Check Binary Signing (for distribution)
  - Swift binary must be code-signed
  - Include in notarization process
- Universal Binary: Build for both ARM64 and Intel
  - In `build.sh`, support lipo for universal binaries: `swift build -c release --arch arm64 --arch x86_64`
- Model Selection: Add UI for multiple Parakeet models
  - Query FluidAudio for available models
  - Let user choose between speed/accuracy tradeoffs
- Offline Support: Detect network issues
  - Show clear error if download fails
  - Suggest downloading when connected
- Performance Monitoring: Track transcription metrics
  - Time to transcribe
  - Model load time
  - Memory usage
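
For the timing metrics listed under Performance Monitoring, a small wrapper around `std::time::Instant` may be enough to start with. The sketch below is illustrative only: `timed` and `TranscriptionMetrics` are hypothetical names, and memory usage is left out because the standard library has no portable way to measure it.

```rust
use std::time::{Duration, Instant};

/// Runs `work` and returns its result together with the elapsed wall-clock time.
fn timed<T, F: FnOnce() -> T>(work: F) -> (T, Duration) {
    let start = Instant::now();
    let value = work();
    (value, start.elapsed())
}

/// Hypothetical container for the two timing metrics named above.
struct TranscriptionMetrics {
    model_load: Duration,
    transcribe: Duration,
}

impl TranscriptionMetrics {
    /// Total time from starting the model load to receiving the text.
    fn total(&self) -> Duration {
        self.model_load + self.transcribe
    }
}
```

The Rust side could wrap each sidecar request in `timed` and record the resulting durations; measuring on the Rust side includes process and JSON overhead, so timings taken inside the Swift sidecar would be lower.
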
- FluidAudio Team: For excellent CoreML speech-to-text SDK
- Tauri Team: For robust sidecar support in v2
- VoiceTypr Community: For testing and feedback
Status: ✅ Implementation Complete | 🧪 Testing Required | 📦 Ready for Integration