feat(app): add custom vocabulary support for Parakeet CTC boosting by pasrom · Pull Request #77 · pasrom/meeting-transcriber

pasrom · 2026-04-01T07:56:27Z

Summary

- Enable domain-specific vocabulary boosting for Parakeet ASR via FluidAudio's CTC keyword spotting
- After TDT transcription, the CTC model runs constrained decoding to detect and correct vocabulary terms
- Settings: file picker for a custom vocabulary file (one term per line)
- Wired through AppSettings → AppState → ParakeetEngine
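
To illustrate the "one term per line" file format, here is a minimal sketch of how such a file could be read. The function names `parseVocabularyTerms` and `loadVocabularyTerms` are hypothetical and not taken from this PR. Trimming whitespace and skipping blank lines are assumptions about sensible handling, not confirmed behavior of ParakeetEngine.

```swift
import Foundation

/// Hypothetical helper (not the PR's code): splits file contents into terms,
/// one per line, trimming surrounding whitespace and dropping blank lines.
func parseVocabularyTerms(_ contents: String) -> [String] {
    var terms: [String] = []
    for line in contents.components(separatedBy: .newlines) {
        let term = line.trimmingCharacters(in: .whitespaces)
        if !term.isEmpty {
            terms.append(term)
        }
    }
    return terms
}

/// Hypothetical helper: returns an empty list when the path is nil, empty,
/// or unreadable, so the caller can simply skip vocabulary boosting.
func loadVocabularyTerms(atPath path: String?) -> [String] {
    guard let path = path, !path.isEmpty,
          let contents = try? String(contentsOfFile: path, encoding: .utf8)
    else {
        return []
    }
    return parseVocabularyTerms(contents)
}
```

Returning an empty list for a missing or empty path mirrors the "empty/missing path handling" cases named in the test plan, though the actual tests may check different behavior.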

Inspired by @execsumo's work in #70.

Stacked on #76 (VAD)

Test plan

- 7 unit tests (defaults, persistence, engine wiring, empty/missing path handling)
- Build passes
- Lint clean (0 violations)
- Manual: create vocab file, select in Settings, transcribe with Parakeet

@execsumo

Voice Activity Detection removes silence before transcription, improving accuracy and speed for recordings with significant pauses.

Uses FluidAudio's Silero VAD v6 model (CoreML/ANE) to detect speech regions, then:
- Extracts speech-only audio for transcription
- Remaps timestamps back to original timeline

New types: SpeechRegion, VadSegmentMap (pure, testable), FluidVAD (wrapper).
Settings: vadEnabled (default off), vadThreshold (0.3–0.9 slider).
Pipeline: single-source path preprocesses with VAD when enabled.

Inspired by @execsumo's work in #70.
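
The timestamp remapping step can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `SpeechSpan` and `TimelineMap` are hypothetical stand-ins, and the real SpeechRegion and VadSegmentMap types may have different fields and methods. The sketch assumes the speech spans are sorted, non-overlapping, and concatenated back to back to form the speech-only audio.

```swift
/// Hypothetical stand-in for a detected speech region,
/// in seconds on the original recording's timeline.
struct SpeechSpan {
    let start: Double
    let end: Double
    var duration: Double { end - start }
}

/// Hypothetical stand-in for a segment map: converts a timestamp on the
/// speech-only (compacted) timeline back to the original timeline.
struct TimelineMap {
    /// Assumed sorted by start time and non-overlapping.
    let spans: [SpeechSpan]

    func originalTime(forCompacted time: Double) -> Double {
        let t = max(0, time)
        var offset = 0.0 // where the current span begins in the compacted audio
        for span in spans {
            if t < offset + span.duration {
                return span.start + (t - offset)
            }
            offset += span.duration
        }
        // Past the end of the compacted audio: clamp to the last span's end,
        // or pass the value through when no speech was detected.
        return spans.last?.end ?? t
    }
}

let map = TimelineMap(spans: [
    SpeechSpan(start: 1.0, end: 3.0),
    SpeechSpan(start: 5.0, end: 6.0),
])
print(map.originalTime(forCompacted: 1.5)) // 2.5 (inside the first span)
print(map.originalTime(forCompacted: 2.5)) // 5.5 (inside the second span)
```

Keeping the mapping a pure value type with no audio or model dependencies is what makes this kind of logic easy to unit test.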

@execsumo

Enable domain-specific vocabulary boosting for the Parakeet ASR engine using FluidAudio's CTC keyword spotting pipeline. After TDT transcription, the CTC model runs constrained decoding to detect and correct vocabulary terms (e.g. company names, product names) that TDT may have misrecognized.

- Add customVocabularyPath to AppSettings (persisted via UserDefaults)
- Add configureVocabulary() to ParakeetEngine that loads vocab file, downloads CTC models, and initializes VocabularyRescorer
- Apply CTC rescoring post-transcription in transcribeSegments()
- Add vocabulary file picker in SettingsView (Parakeet section)
- Wire customVocabularyPath from AppSettings to ParakeetEngine in AppState
- Add CustomVocabularyTests (7 tests: defaults, persistence, engine wiring)
- Update Package.resolved to match main branch (FluidAudio 0.13.4)

Inspired by @execsumo's work in #70.
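
As a rough illustration of a setting "persisted via UserDefaults", the sketch below stores an optional path under a single key. The class name `VocabularySettingsSketch`, the key string, and the choice to treat an empty string as "no vocabulary" are assumptions made for this example. The real AppSettings may be structured quite differently.

```swift
import Foundation

/// Hypothetical sketch, not the PR's AppSettings: persists an optional
/// vocabulary file path in an injectable UserDefaults instance.
final class VocabularySettingsSketch {
    /// Assumed key name, chosen for illustration only.
    static let key = "customVocabularyPath"

    private let defaults: UserDefaults

    init(defaults: UserDefaults = .standard) {
        self.defaults = defaults
    }

    /// nil means "no custom vocabulary configured".
    var customVocabularyPath: String? {
        get {
            guard let path = defaults.string(forKey: Self.key), !path.isEmpty else {
                return nil
            }
            return path
        }
        set {
            if let path = newValue, !path.isEmpty {
                defaults.set(path, forKey: Self.key)
            } else {
                // Clearing the setting removes the key instead of storing "".
                defaults.removeObject(forKey: Self.key)
            }
        }
    }
}
```

Injecting the UserDefaults instance lets tests use a throwaway suite instead of the app's standard defaults, which is one common way to cover the "defaults" and "persistence" cases.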

pasrom added 2 commits April 1, 2026 09:49

github-actions bot added the enhancement (New feature or request) label Apr 1, 2026

pasrom mentioned this pull request Apr 1, 2026

feat(app): add Sortformer diarizer mode for overlap-aware speaker diarization #78

Merged

4 tasks

Base automatically changed from feat/vad-preprocessing to main April 1, 2026 08:00

pasrom merged commit 54e330b into main Apr 1, 2026
4 checks passed

pasrom deleted the feat/custom-vocabulary branch April 1, 2026 08:01

pasrom mentioned this pull request Apr 1, 2026

Migrate to FluidAudio, add VAD, LS-EEND diarization, and build fixes #70

Open

8 tasks

pasrom merged 2 commits into main from feat/custom-vocabulary
