Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
Audio Codec Speech processing Universal PERformance Benchmark
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Training code for FAcodec presented in NaturalSpeech3
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
Comprehensive quantitative comparison of lossless and lossy audio codecs
JAX Implementations of Descript Audio Codec and EnCodec
LLM-Codec: Neural Audio Codec Meets Language Model Objectives
A pure Python implementation of Google's ViSQOL (Virtual Speech Quality Objective Listener) for objective audio/speech quality assessment.
Extended Opus Encoding Technical
Add more settings for Opus from the previous PyOgg version
Open research toward end-to-end spoken dialogue systems. Ships SoviaMate-Codec — a neural audio codec for LLM integration with ASR-constrained encoding, enhancement training, and zero-shot speaker adaptation.
🔊 Build audio tokenizers and detokenizers for speech large language models with LongCat-Audio-Codec to enhance audio processing and understanding.
An Ultimate Audio Coding prototype in Python
Standalone inference for Marco Voice v16 v2 (causal S3 tokens 1024@25Hz + Emosphere flow)
APU-Codec: Neural audio codec from source, optimized for AMD APU tri-processor inference (NPU encoder, GPU decoder)
Real-time semantic audio codec achieving 300bps bandwidth via gen AI reconstruction.