marcoyang1998

Xiaoyu Yang marcoyang1998

Speech recognition, multimodal

31 followers · 6 following

University of Cambridge
Cambridge

Achievements

x2 x2

Stars

yfyeung / CLSP

Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training

68 1 Updated Feb 7, 2026

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 11,291 1,663 Updated Mar 1, 2026

k2-fsa / Flow2GAN

Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation

Python 135 8 Updated Jan 21, 2026

xiaomi-research / xares-llm

XARES-LLM

Python 54 3 Updated Feb 11, 2026

facebookresearch / omnilingual-asr

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,711 242 Updated Dec 30, 2025

k2-fsa / ZipVoice

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 871 117 Updated Dec 2, 2025

ddlBoJack / MMAR

[NeurIPS 2025] Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

Python 197 4 Updated Feb 25, 2026

xiaomi-research / r1-aqa

🤗 R1-AQA Model: mispeech/r1-aqa

Python 314 29 Updated Mar 28, 2025

XiaoMi / subllm

This repository is the official implementation of the ECAI 2024 conference paper SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM

Python 68 4 Updated Aug 13, 2024

SpeechColab / GigaSpeech2

An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement

Python 185 12 Updated Sep 1, 2025

huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python 4,053 349 Updated Jan 8, 2025

k2-fsa / libriheavy

Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context

Python 216 12 Updated Sep 10, 2024

salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 11,176 1,096 Updated Nov 18, 2024

bytedance / SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

1,391 111 Updated Feb 3, 2026

meta-llama / llama

Inference code for Llama models

Python 59,199 9,818 Updated Jan 26, 2025

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 157,533 32,321 Updated Mar 7, 2026

k2-fsa / divide_lm

Python 4 3 Updated Apr 25, 2023

k2-fsa / text_search

Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup

Python 79 15 Updated Jun 30, 2025

lhotse-speech / lhotse

Tools for handling multimodal data in machine learning projects.

Python 1,118 262 Updated Mar 7, 2026

k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi

C++ 886 142 Updated Mar 7, 2026

marcoyang1998 / icefall

Forked from k2-fsa/icefall

Python 3 Updated Feb 12, 2026

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 10,661 1,205 Updated Mar 6, 2026

k2-fsa / sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, L…

C++ 1,644 206 Updated Oct 20, 2025

k2-fsa / multi_quantization

Python 46 10 Updated Nov 2, 2023

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 15,339 5,356 Updated Sep 22, 2025

k2-fsa / icefall

Python 1,370 400 Updated Mar 3, 2026

k2-fsa / k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Cuda 1,306 233 Updated Mar 6, 2026

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,755 2,385 Updated Mar 5, 2026

imfunniee / gitfolio

personal website + blog for every github user

JavaScript 6,751 681 Updated Feb 19, 2022

graykode / gpt-2-Pytorch

Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation

Python 1,014 233 Updated Jul 8, 2019
