Shanghai Jiao Tong University · Shanghai
Stars
[DEPRECATED] Moved to the ROCm/rocm-libraries repo. NOTE: the develop branch is maintained as a read-only mirror.
[ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding
The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".
Supercharge Your LLM with the Fastest KV Cache Layer
Turn any glasses into AI-powered smart glasses
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Codebase for the Recognize Anything Model (RAM)
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning
Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
Primarily a collection of knowledge and interview questions for large language model (LLM) algorithm/application engineers.
[ICCV 2025] StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
[CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
The simplest, fastest repository for training/finetuning medium-sized GPTs.
🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
[ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)