Skip to content

Commit 10909df

Browse files
committed
feat: replace Python Parakeet with Swift/FluidAudio implementation
- Replaced 123MB Python/MLX sidecar with 1.2MB Swift implementation - Integrated FluidAudio SDK for native Apple Neural Engine acceleration - Added automated Swift sidecar build process via build.rs - Updated model management to use FluidAudio's caching system - Improved reset functionality to clean FluidAudio cached models - Added comprehensive documentation for Swift integration - macOS-only feature using native performance optimizations - Reduced app bundle size by ~122MB while maintaining functionality
1 parent fffba52 commit 10909df

File tree

18 files changed

+1152
-128
lines changed

18 files changed

+1152
-128
lines changed

CLAUDE.md

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -95,12 +95,20 @@ pnpm typecheck # Run TypeScript compiler
9595

9696
**Completed**:
9797
- Core recording and transcription functionality
98-
- Model download and management
98+
- Model download and management (Whisper + Parakeet)
99+
- **NEW**: Swift/FluidAudio Parakeet sidecar (1.2MB vs 123MB Python)
99100
- Settings persistence
100101
- Comprehensive test suite (110+ tests)
101102
- Error boundaries and recovery
102103
- Global hotkey support
103104

105+
📝 **Recent Updates**:
106+
- Parakeet Swift integration complete (see `PARAKEET_SWIFT_INTEGRATION.md`)
107+
- Native Apple Neural Engine support for **macOS only** (see `PARAKEET_MACOS_ONLY_FIX.md`)
108+
- Automated sidecar build via `build.rs`
109+
- Parakeet V2 removed, only V3 available
110+
- Dynamic engine detection (whisper/parakeet)
111+
104112
### Common Patterns
105113

106114
1. **Error Handling**: Always wrap risky operations in try-catch

PARAKEET_SWIFT_INTEGRATION.md

Lines changed: 331 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,331 @@
1+
# Parakeet Swift Integration - Implementation Summary
2+
3+
## 🎯 Overview
4+
5+
Successfully replaced the 123MB Python/MLX Parakeet sidecar with a 1.2MB Swift/FluidAudio implementation that provides native macOS transcription using Apple Neural Engine.
6+
7+
**⚠️ Platform Support**: This integration is **macOS-only**. Windows and Linux will continue to use Whisper models exclusively.
8+
9+
---
10+
11+
## ✅ What Was Implemented
12+
13+
### 1. Swift Sidecar (`/sidecar/parakeet-swift/`)
14+
15+
**Files Created:**
16+
- `Sources/main.swift` - Main sidecar logic with FluidAudio integration
17+
- `Package.swift` - Swift package configuration
18+
- `build.sh` - Automated build script with proper target triple naming
19+
- `README.md` - Comprehensive documentation
20+
- `.gitignore` - Git ignore rules for build artifacts
21+
22+
**Features:**
23+
- ✅ JSON-based communication protocol (stdin/stdout)
24+
- ✅ Commands: `load_model`, `transcribe`, `delete_model`, `unload_model`, `status`
25+
- ✅ FluidAudio SDK integration (v0.5.2)
26+
- ✅ Apple Neural Engine acceleration
27+
- ✅ Proper error handling and status responses
28+
- ✅ Model caching managed by FluidAudio
29+
30+
### 2. Rust Backend Integration
31+
32+
**Modified Files:**
33+
- `src-tauri/src/parakeet/manager.rs` - Delegates to Swift sidecar
34+
- `src-tauri/src/parakeet/messages.rs` - Added `DeleteModel` command
35+
- `src-tauri/src/commands/reset.rs` - Clears FluidAudio cache
36+
- `src-tauri/src/commands/model.rs` - Unified model management
37+
- `src-tauri/build.rs` - Automatically builds Swift sidecar
38+
- `src-tauri/tauri.conf.json` - Updated externalBin path
39+
40+
**Key Improvements:**
41+
- ✅ Proper model availability checking via FluidAudio cache
42+
- ✅ Health check function for sidecar verification
43+
- ✅ Download/delete operations delegate to Swift
44+
- ✅ Reset App Data clears all FluidAudio cached files
45+
46+
### 3. Build System
47+
48+
**Automated Build Process:**
49+
1. `pnpm tauri dev` or `pnpm tauri build`
50+
2. → Triggers `src-tauri/build.rs`
51+
3. → Runs `sidecar/parakeet-swift/build.sh`
52+
4. → Produces `dist/parakeet-sidecar-aarch64-apple-darwin`
53+
5. → Tauri bundles it automatically
54+
55+
**Target Triple Handling:**
56+
- macOS ARM64: `aarch64-apple-darwin`
57+
- macOS Intel: `x86_64-apple-darwin`
58+
- Future: Linux/Windows targets configurable
59+
60+
---
61+
62+
## 📊 Benefits
63+
64+
| Metric | Old (Python/MLX) | New (Swift/FluidAudio) | Improvement |
65+
|--------|------------------|------------------------|-------------|
66+
| Binary Size | 123 MB | 1.2 MB | **99% smaller** |
67+
| Download Size | 123 MB + 500 MB models | 1.2 MB + 500 MB models | Same models, tiny binary |
68+
| Performance | MLX (CPU/GPU) | Apple Neural Engine | **Native acceleration** |
69+
| User Control | Auto-download | User clicks Download | **Better UX** |
70+
| macOS Integration | Python runtime | Native Swift | **Fully native** |
71+
72+
---
73+
74+
## 🔄 Data Flow
75+
76+
### Download Flow
77+
```
78+
1. User clicks "Download" in Settings
79+
2. Frontend → Rust: download_model(model_name)
80+
3. Rust → Swift: {"type": "load_model", "model_id": "..."}
81+
4. Swift → FluidAudio: AsrModels.downloadAndLoad()
82+
5. FluidAudio downloads CoreML to ~/Library/Application Support/
83+
6. Swift → Rust: {"type": "status", "loaded_model": "..."}
84+
7. Rust → Frontend: model-downloaded event
85+
```
86+
87+
### Transcription Flow
88+
```
89+
1. User records audio
90+
2. Frontend → Rust: transcribe(audio_path)
91+
3. Rust → Swift: {"type": "transcribe", "audio_path": "..."}
92+
4. Swift → FluidAudio: asrManager.transcribe(fileURL)
93+
5. FluidAudio uses Apple Neural Engine
94+
6. Swift → Rust: {"type": "transcription", "text": "..."}
95+
7. Rust → Frontend: Insert text at cursor
96+
```
97+
98+
### Delete Flow
99+
```
100+
1. User clicks "Remove" in Settings
101+
2. Frontend → Rust: delete_model(model_name)
102+
3. Rust → Swift: {"type": "delete_model"}
103+
4. Swift deletes:
104+
- ~/Library/Application Support/FluidAudio/
105+
- ~/Library/Application Support/parakeet-tdt-0.6b-v3-coreml/
106+
- ~/Library/Caches/FluidAudio/
107+
5. Swift → Rust: {"type": "status", "loaded_model": null}
108+
6. Rust → Frontend: model-deleted event
109+
```
110+
111+
### Reset App Data Flow
112+
```
113+
1. User clicks "Reset App Data"
114+
2. Frontend → Rust: reset_app_data()
115+
3. Rust clears:
116+
- FluidAudio cache directories
117+
- Old Parakeet tracking dirs
118+
- Tauri stores (settings, transcriptions)
119+
- Secure store (API keys)
120+
- System preferences
121+
4. Rust → Frontend: reset-complete event
122+
```
123+
124+
---
125+
126+
## 🧪 Testing Checklist
127+
128+
### Manual Testing Required
129+
130+
- [ ] **Build Test**: `pnpm tauri dev` compiles Swift sidecar
131+
- [ ] **Health Check**: App starts without sidecar errors
132+
- [ ] **Download**: Click Download, verify ~500MB CoreML downloads
133+
- [ ] **Status Check**: Downloaded model shows as available
134+
- [ ] **Transcription**: Record audio, verify transcription works
135+
- [ ] **Quality**: Check transcription accuracy vs Whisper
136+
- [ ] **Delete**: Click Remove, verify files deleted from disk
137+
- [ ] **Re-download**: Download again after delete
138+
- [ ] **Reset App Data**: Verify all Parakeet data cleared
139+
- [ ] **Persistence**: Model selection survives app restart
140+
141+
### Automated Tests Needed (Future)
142+
143+
```rust
144+
// Integration test idea
145+
#[tokio::test]
146+
async fn test_parakeet_sidecar_communication() {
147+
let app = test_app();
148+
let manager = ParakeetManager::new(temp_dir());
149+
150+
// Health check
151+
assert!(manager.health_check(&app).await.is_ok());
152+
153+
// Status check
154+
let response = manager.client.send(&app, &ParakeetCommand::Status {}).await.unwrap();
155+
assert!(matches!(response, ParakeetResponse::Status { .. }));
156+
}
157+
```
158+
159+
---
160+
161+
## 🚨 Known Limitations
162+
163+
### Platform Limitations
164+
165+
1. **macOS Only**: Swift/FluidAudio is macOS-exclusive (by design)
166+
- **Backend**: Returns empty Parakeet model list on Windows/Linux
167+
- **Frontend**: Dynamically detects engine from selected model
168+
- Windows/Linux: Only Whisper models appear in UI
169+
- Future: May add Windows-specific native models if available
170+
171+
2. **Model Availability Heuristic**:
172+
- Currently checks if FluidAudio cache directories exist
173+
- Not 100% accurate if user manually deletes files
174+
- **Improvement**: Query sidecar status on app startup
175+
176+
3. **No Progress for Model Download**:
177+
- FluidAudio doesn't expose download progress
178+
- UI shows indeterminate spinner
179+
- User must wait ~2-5 minutes for 500MB download
180+
181+
4. **Single Model Support**:
182+
- Only Parakeet TDT 0.6B v3 currently available
183+
- FluidAudio may support more models in future
184+
185+
### Future Improvements
186+
187+
- [ ] Expose FluidAudio download progress (if SDK adds support)
188+
- [ ] Add proper model availability query on startup
189+
- [ ] Support multiple Parakeet model variants
190+
- [ ] Add offline mode detection (warn if no internet for download)
191+
- [ ] Implement model update mechanism
192+
193+
---
194+
195+
## 📝 Files Modified
196+
197+
### New Files
198+
```
199+
sidecar/parakeet-swift/Sources/main.swift
200+
sidecar/parakeet-swift/Package.swift
201+
sidecar/parakeet-swift/build.sh
202+
sidecar/parakeet-swift/README.md
203+
sidecar/parakeet-swift/.gitignore
204+
PARAKEET_SWIFT_INTEGRATION.md (this file)
205+
```
206+
207+
### Modified Files
208+
```
209+
src-tauri/build.rs
210+
src-tauri/tauri.conf.json
211+
src-tauri/src/parakeet/manager.rs (macOS-only logic added)
212+
src-tauri/src/parakeet/models.rs (removed V2, macOS-only)
213+
src-tauri/src/parakeet/messages.rs
214+
src-tauri/src/commands/reset.rs
215+
src-tauri/src/commands/model.rs
216+
src/components/onboarding/OnboardingDesktop.tsx (dynamic engine detection)
217+
```
218+
219+
### Unchanged (Already Configured)
220+
```
221+
src-tauri/src/parakeet/sidecar.rs (communication logic)
222+
src-tauri/capabilities/macos.json (sidecar permissions)
223+
src-tauri/capabilities/default.json (sidecar permissions)
224+
```
225+
226+
---
227+
228+
## 🎓 Lessons Learned
229+
230+
### Tauri v2 Sidecar Best Practices
231+
232+
1. **Binary Naming**: Must follow `binary-name-$TARGET_TRIPLE` format
233+
- Example: `parakeet-sidecar-aarch64-apple-darwin`
234+
- Tauri automatically appends target triple when spawning
235+
236+
2. **externalBin Path**: Points to base name WITHOUT target triple
237+
- ✅ Correct: `"../sidecar/parakeet-swift/dist/parakeet-sidecar"`
238+
- ❌ Wrong: `"../sidecar/parakeet-swift/dist/parakeet-sidecar-aarch64-apple-darwin"`
239+
240+
3. **Build Integration**: Use `build.rs` for automated compilation
241+
- Runs before Tauri build
242+
- Gracefully handles build failures
243+
- Supports incremental builds
244+
245+
4. **Permissions**: Configure in `capabilities/*.json`
246+
- `shell:allow-spawn` for launching sidecar
247+
- `shell:allow-stdin-write` for sending commands
248+
249+
5. **Communication**: JSON over stdin/stdout is reliable
250+
- Use line-delimited JSON
251+
- Always flush stdout after writing
252+
- Handle stderr for debugging
253+
254+
### Swift/FluidAudio Specifics
255+
256+
1. **Package Management**: Swift Package Manager is straightforward
257+
- Dependencies resolve automatically
258+
- Release builds are optimized and small
259+
260+
2. **FluidAudio SDK**: v0.5.2 is stable
261+
- Requires macOS 13.0+
262+
- Handles model caching automatically
263+
- Returns simple `ASRResult` struct
264+
265+
3. **JSON Serialization**: Swift Codable is powerful
266+
- Use `CodingKeys` enum for snake_case conversion
267+
- Default values in structs don't decode (use initializers)
268+
269+
---
270+
271+
## 🚀 Next Steps
272+
273+
### Immediate (Before Release)
274+
275+
1. **Test End-to-End Flow**
276+
```bash
277+
pnpm tauri dev
278+
# → Test: Download → Transcribe → Remove → Reset
279+
```
280+
281+
2. **Verify Build Process**
282+
```bash
283+
pnpm tauri build
284+
# → Ensure sidecar is bundled in .app
285+
```
286+
287+
3. **Check Binary Signing** (for distribution)
288+
- Swift binary must be code-signed
289+
- Include in notarization process
290+
291+
### Future Enhancements
292+
293+
1. **Universal Binary**: Build for both ARM64 and Intel
294+
```bash
295+
# In build.sh, support lipo for universal binaries
296+
swift build -c release --arch arm64 --arch x86_64
297+
```
298+
299+
2. **Model Selection**: Add UI for multiple Parakeet models
300+
- Query FluidAudio for available models
301+
- Let user choose between speed/accuracy tradeoffs
302+
303+
3. **Offline Support**: Detect network issues
304+
- Show clear error if download fails
305+
- Suggest downloading when connected
306+
307+
4. **Performance Monitoring**: Track transcription metrics
308+
- Time to transcribe
309+
- Model load time
310+
- Memory usage
311+
312+
---
313+
314+
## 📚 References
315+
316+
- [Tauri v2 Sidecar Documentation](https://v2.tauri.app/develop/sidecar/)
317+
- [FluidAudio SDK](https://github.com/FluidInference/FluidAudio)
318+
- [Swift Package Manager Guide](https://swift.org/package-manager/)
319+
- [Apple Neural Engine](https://developer.apple.com/machine-learning/core-ml/)
320+
321+
---
322+
323+
## ✨ Credits
324+
325+
- **FluidAudio Team**: For excellent CoreML speech-to-text SDK
326+
- **Tauri Team**: For robust sidecar support in v2
327+
- **VoiceTypr Community**: For testing and feedback
328+
329+
---
330+
331+
**Status**: ✅ Implementation Complete | 🧪 Testing Required | 📦 Ready for Integration

package.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -70,5 +70,6 @@
7070
"typescript": "~5.6.2",
7171
"vite": "^6.3.6",
7272
"vitest": "^3.2.4"
73-
}
73+
},
74+
"packageManager": "pnpm@10.13.1+sha512.37ebf1a5c7a30d5fabe0c5df44ee8da4c965ca0c5af3dbab28c3a1681b70a256218d05c81c9c0dcf767ef6b8551eb5b960042b9ed4300c59242336377e01cfad"
7475
}

sidecar/parakeet-swift/.gitignore

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
.DS_Store
2+
/.build
3+
/Packages
4+
xcuserdata/
5+
DerivedData/
6+
.swiftpm/configuration/registries.json
7+
.swiftpm/xcode/package.xcworkspace/contents.xcworkspacedata
8+
.netrc
9+
dist/

sidecar/parakeet-swift/Package.resolved

Lines changed: 15 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)