suzhenghang
  • Guang Zhou

Auto party rave lights

Python 25 6 Updated Dec 3, 2025

The most powerful local music generation model that outperforms most commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.

Python 7,386 822 Updated Mar 7, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 275,473 52,572 Updated Mar 7, 2026

A high quality and fast TTS repository

Python 505 42 Updated Dec 22, 2025

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 3,357 290 Updated Jan 5, 2026

A Python 3 module to control DMX using OpenDMX or uDMX - Featuring fixture profiles, built-in effects and a web control panel.

Python 139 24 Updated Feb 5, 2024

Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

Python 913 80 Updated Feb 25, 2026

UNOFFICIAL - A tool converting sound input to OSC trigger signals.

C++ 115 11 Updated Mar 20, 2019

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

Python 331 56 Updated Nov 20, 2024

Intelligent, real-time, audio-responsive DMX light control.

Jupyter Notebook 10 Updated Feb 13, 2026

Python 10,405 683 Updated Feb 9, 2026

Official implementation of YingMusic-SVC.

Python 121 12 Updated Dec 29, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 7,498 449 Updated Feb 10, 2026

CogView4, CogView3-Plus and CogView3 (ECCV 2024)

Python 1,105 80 Updated Mar 29, 2025

Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.

C++ 2,685 238 Updated Jan 22, 2026

This is the official implementation of our paper: "MiniMax-Remover: Taming Bad Noise Helps Video Object Removal"

Python 543 51 Updated Jul 27, 2025

AI-based tool for removing hard-coded subtitles and text-like watermarks from videos or pictures, producing subtitle-free and watermark-free files at the original resolution. Runs locally; no third-party API required.

Python 9,700 1,211 Updated Dec 3, 2025

✨ AsrTools: Smart Voice-to-Text Tool | Efficient Batch Processing | User-Friendly Interface | No GPU Required | Supports SRT/TXT Output | Turn your audio into accurate text in an instant!

Python 3,104 291 Updated Nov 25, 2025

A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows

Python 228 18 Updated Jan 8, 2026

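A tool focused on AI translation: one-click automatic translation of RPG and SLG games, EPUB and TXT novels, PDF, Word, and MD documents, SRT, VTT, and LRC subtitles, and other complex long texts.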

Python 5,281 336 Updated Feb 27, 2026

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,710 242 Updated Dec 30, 2025

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech.

Python 875 58 Updated Feb 13, 2026

Repository for training models for music source separation.

Python 1,192 180 Updated Feb 4, 2026

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 3,202 417 Updated Dec 11, 2025

Learn to build and deploy local Visual Language Models for Edge AI

Jupyter Notebook 371 44 Updated Oct 30, 2025

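🧑‍🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).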

7,646 742 Updated Mar 7, 2026

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

Python 636 51 Updated Feb 26, 2026

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Python 902 60 Updated Oct 15, 2025

Official Implementation of "Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion"

Python 24 6 Updated Nov 12, 2025

Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline.

Python 17 3 Updated May 9, 2025