What's Changed
- Add DACVAE Codec by @Blaizzy in #2
- Add CI/CD tests by @Blaizzy in #3
- Add GLM ASR (STT) by @Blaizzy in #4
- Stop audio and generation when backgrounded to fix app crash by @adrgrondin in #8
- Fix the framework build so it can be referenced in Xcode by @lucasnewman in #10
- MLX-Audio Swift SDK v1 by @Blaizzy in #1
- Update README for modular SDK architecture and remove Kokoro TTS by @Copilot in #11
- Project cleanup by @lucasnewman in #12
- Add a basic CLI tool for TTS generation by @lucasnewman in #15
- Add support for Marvis TTS by @lucasnewman in #16
- Add Pocket TTS by @lucasnewman in #19
- Easier Voices app build by @lucasnewman in #20
- Fix voice matching for Pocket TTS by @lucasnewman in #21
- Use voices from the model repo instead of hardcoding by @lucasnewman in #22
- Move STT models to Models folder by @Blaizzy in #24
- Add Qwen3 ASR by @Blaizzy in #26
- Fix Soprano model by @lucasnewman in #28
- Add READMEs for each STT/TTS model by @lucasnewman in #27
- Add a simple local chat example by @lucasnewman in #25
- [Qwen3-ASR] Add live transcript support and lower peak usage by @Blaizzy in #32
- Add Qwen3-TTS conditional generation (VoiceDesign) support by @INQTR in #23
- [VAD / Diarization] Add sortformer by @Blaizzy in #33
- Refactor tests by @Blaizzy in #34
- Improved TTS sampling param handling by @lucasnewman in #30
- [STS] Add MossFormer2 SE speech enhancement module by @beshkenadze in #29
- [Qwen3ASR] Add STT + Live transcript to UI by @Blaizzy in #39
- Add remaining support for Qwen3 TTS base models by @lucasnewman in #45
- fix(sts): MossFormer2 speech enhancement quality — Kaldi fbank, RoPE, local loading by @beshkenadze in #44
- Add SAM Audio STS model by @lucasnewman in #46
- Add command line tool for STT by @lucasnewman in #47
- Fix loading of quantized Soprano models by @lucasnewman in #50
- Add Parakeet STT models by @lucasnewman in #51
- Add Voxtral Realtime STT model by @lucasnewman in #52
- Add LFM-2.5-Audio by @Blaizzy in #53
- Refactor STS app by @Blaizzy in #56
- Fix quantization for Qwen3 TTS models by @lucasnewman in #61
- Fix Voices App Example by @ikenwoo in #59
- Allow configuring the cache location for downloaded models by @lucasnewman in #60
- Remove dead code and fix all warnings by @lucasnewman in #62
- Add streaming support to the built-in audio player by @lucasnewman in #63
- Add Smart Turn v3 model by @lucasnewman in #64
- Fix Swift 6.2 compilation errors: @Sendable closures and OptionSet type inference by @BenRacicot in #66
- Semantic VAD example by @lucasnewman in #67
- Add stock voices and clean up UI in the Voices app by @lucasnewman in #68
- Fix audio entitlements for macOS and file validation in resolveDownload by @Blaizzy in #69
- Add streaming audio chunk support in Qwen3TTS model by @Blaizzy in #70
New Contributors
- @Blaizzy made their first contribution in #2
- @adrgrondin made their first contribution in #8
- @lucasnewman made their first contribution in #10
- @Copilot made their first contribution in #11
- @INQTR made their first contribution in #23
- @beshkenadze made their first contribution in #29
- @ikenwoo made their first contribution in #59
- @BenRacicot made their first contribution in #66
Full Changelog: https://github.com/Blaizzy/mlx-audio-swift/commits/v0.1.0