14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent content routing. Zero LLM inference cost. MIT licensed.
-
Updated
Apr 1, 2026 - Python
14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent content routing. Zero LLM inference cost. MIT licensed.
[TMLR 2026] Survey: https://arxiv.org/pdf/2507.20198
📚 Collection of token-level model compression resources.
Hook-based token compressor for 5 AI CLI hosts (Claude Code, Copilot CLI, OpenCode, Gemini CLI, Codex CLI). Up to 95% bash compression, signature-mode for code reads, cross-call dedup, MCP server, self-teaching protocol. Zero runtime deps.
Token-Oriented Object Notation - A compact data format for reducing token consumption when sending structured data to LLMs (PHP implementation)
The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
Open-source AI gateway written in Rust, with token compression for Claude Code, Codex... and any other LLM client.
You say it. AutoCode ships it. 46 skills. Code to deployment in one session. I-Lang v3.0 powered. Free forever.
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
ultra-lightweight, mathematically robust prompt compression middleware
⚡ Cut Claude Code context 60-90%. Live stdout today, session-history compression coming v0.2.
[ICLR 2026] MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding
The browser engine for agents. HTML in, Semantic Object Model out. 10x token compression, V8 JS rendering, CDP compatible. Apache-2.0.
Lossless-first prompt compression for JSON, YAML, CSV, and Markdown. Library, CLI, MCP server, desktop app, and browser extension.
😎 Awesome papers on token redundancy reduction
This repo integrates DyCoke's token compression method with VLMs such as Gemma3 and InternVL3
[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models.
Add a description, image, and links to the token-compression topic page so that developers can more easily learn about it.
To associate your repository with the token-compression topic, visit your repo's landing page and select "manage topics."