-
Northwestern Polytechnical University
- China
Pinned
-
Qwen3-TTS
Public, forked from QwenLM/Qwen3-TTS
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
Python
-
Qwen3-Omni
Public, forked from QwenLM/Qwen3-Omni
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Jupyter Notebook
-
Qwen2.5-Omni
Public, forked from QwenLM/Qwen2.5-Omni
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.
Jupyter Notebook
-
SoulX-Podcast
Public, forked from Soul-AILab/SoulX-Podcast
SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.
Python
-
OSUM
Public, forked from ASLP-lab/OSUM
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
Python
-
VoiceSculptor
Public, forked from ASLP-lab/VoiceSculptor
An instruct text-to-speech solution based on LLaSA and CosyVoice2, developed by the ASLP lab and collaborators.
Python

