Skip to content

gerrict/ghost-token-hunter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

141 Commits
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Token Whisperer: The Ultimate Context Integrity Engine for LLM Workflows

Download

Eliminate Token Ghosts, Prevent Context Decay, and Survive Compaction with Surgical Precision


πŸš€ What is Token Whisperer?

Token Whisperer is not just another token optimizerβ€”it is a contextual immune system for your large language model interactions. Inspired by the existential challenges of "ghost tokens" and "context quality decay" from the original token-optimizer concept, this repository reimagines token management as a survival game where every token matters, every compaction is a strategic decision, and your AI's memory is never corrupted by invisible parasites.

Think of it this way: if standard token optimizers are like squeezing water from a sponge, Token Whisperer is a neural archaeologistβ€”it digs through your prompt history, identifies phantom tokens that degrade response quality, and surgically removes them before they cause a "context avalanche." The result? Faster, cheaper, and more coherent AI interactions that maintain their semantic integrity across long conversations.


πŸ“₯ Quick Start & Download

Download

Installation

git clone https://github.com/your-username/token-whisperer.git
cd token-whisperer
pip install -r requirements.txt

πŸ” The Problem Token Whisperer Solves

🧟 Ghost Tokens: The Silent Killers

When you chat with an LLM, every previous message leaves behind token echoes. Over time, these become ghost tokensβ€”fragments of outdated context, partial retractions, or conversational debris that distort the model's understanding. Token Whisperer identifies these specters using a proprietary Root Cause Analysis Engine that traces each token's contribution to response quality.

πŸŒͺ️ Context Quality Decay (CQD)

CQD is the gradual erosion of relevance in long conversations. Like a photograph fading in sunlight, your AI's memory degrades with each new prompt. Token Whisperer implements active context preservationβ€”a technique that compacts your prompt while maintaining semantic weight, ensuring the model remembers what actually matters.

πŸ’₯ Surviving Compaction

Aggressive token reduction often destroys nuance. Token Whisperer's adaptive compaction algorithm simulates thousands of compaction scenarios in milliseconds, choosing the one that maximizes information density while minimizing loss. It's like having a master editor who knows exactly which words to cut and which to keep.


πŸ“Š System Architecture

flowchart TD
    A[Raw API Request] --> B[Token Ghost Detection]
    B --> C{Ghost Found?}
    C -->|Yes| D[Exorcism Engine - token removal]
    C -->|No| E[Context Quality Analysis]
    D --> E
    E --> F{Decay Detected?}
    F -->|Yes| G[Semantic Compaction]
    F -->|No| H[Direct API Call]
    G --> H
    H --> I[Response Monitor]
    I --> J[Update Ghost Registry]
    J --> A
    K[Claude API] --> H
    L[OpenAI API] --> H
    M[User Preferences] --> B
    N[Multilingual Models] --> E
Loading

🎯 Example Profile Configuration

Create a profile.json for your custom token whisperer settings:

{
  "profile_name": "max_clarity",
  "engine": {
    "ghost_detection_sensitivity": 0.85,
    "compaction_strength": "aggressive",
    "context_preservation_weight": 0.92
  },
  "api": {
    "openai_key": "$OPENAI_API_KEY",
    "claude_key": "$ANTHROPIC_API_KEY",
    "default_model": "gpt-4",
    "fallback_model": "claude-3"
  },
  "filters": {
    "language": ["en", "es", "de", "fr", "zh"],
    "max_retention_tokens": 4096,
    "min_response_quality": 0.8
  },
  "response_ui": {
    "theme": "responsive",
    "multilingual_support": true,
    "customer_service_mode": "24/7"
  }
}

This configuration ensures:

  • 85% sensitivity to ghost tokens (balanced between false positives and missing actual ghosts)
  • Aggressive compaction for cost savings during long chains
  • 92% semantic preservation to prevent CQD
  • Automatic API fallback from OpenAI to Claude if one fails

⌨️ Example Console Invocation

# Basic ghost scan and compaction for a long conversation
python whisper.py --profile max_clarity --input conversation_log.txt --output optimized_prompt.txt

# Real-time monitoring mode with websocket
python whisper.py --server --port 8080 --profile realtime_fast

# Batch processing of multiple conversations
python whisper.py --batch-folder ./logs/ --compaction intelligent

# Analyze ghost token density in a specific thread
python whisper.py --analyze-thread thread_12345.json --verbose --ghost-heatmap

πŸ“± Emoji OS Compatibility Table

Operating System Emoji Rendering Token Optimization Ghost Detection
macOS 15.0+ βœ… Full βœ… Native βœ… Enhanced
Windows 11 βœ… Full βœ… Optimized βœ… Standard
Linux (GNOME) βœ… Full βœ… Custom βœ… Full
iOS 18+ βœ… Native βœ… Mobile βœ… Edge
Android 14+ βœ… Native βœ… Lightweight βœ… Efficient
ChromeOS βœ… Partial βœ… Web βœ… Basic

All major OS platforms support CLI and responsive UI. For mobile emoji compatibility, we recommend using the companion web app with full Unicode 15.0 support.


✨ Feature List

πŸ”₯ Core Optimization

  • Ghost Token Detection: Proprietary algorithm that identifies tokens with low contextual contribution
  • Active Context Preservation: Maintain semantic density during aggressive compaction
  • Adaptive Compaction: Machine learning model that chooses optimal compression ratio
  • Root Cause Analysis: Traces each ghost token to its source for full transparency

🌐 API Integration

  • OpenAI API: Full support for GPT-3.5 to GPT-4-turbo models
  • Claude API: Integration with Claude 2, 3, and Opus models
  • Automatic Fallback: If one API fails, seamlessly switch to another
  • Cost Monitoring: Real-time token usage tracking with budget alerts

πŸ–₯️ User Experience

  • Responsive UI: Works on desktop, tablet, and mobile without reconfiguration
  • Multilingual Support: Processes prompts in 15+ languages with custom token optimizers per language
  • 24/7 Customer Service: Built-in monitoring and alerting for production deployments
  • Real-time Dashboard: Web-based interface showing ghost density, compaction ratios, and cost savings

πŸ›‘οΈ Advanced Capabilities

  • Compaction Survival Mode: Ensures critical information is never lost during reduction
  • Context Decay Prevention: Proactive recalibration before quality drops below thresholds
  • Token Budget Management: Set hard limits with intelligent prioritization
  • Batch Processing: Optimize thousands of conversations in parallel

πŸ”— SEO-Optimized Keywords Integration

  • Token optimization software for AI chatbots
  • LLM context window management tool
  • Ghost token removal for GPT-4 and Claude
  • Context quality decay prevention
  • Cost-effective AI conversation tool
  • Multilingual prompt optimization
  • Responsive token management UI
  • OpenAI and Claude API cost reduction
  • Real-time token budget tracker
  • Semantic compaction algorithm
  • AI conversation memory preservation

βš™οΈ OpenAI & Claude API Integration Guide

OpenAI Setup

export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxx"

Token Whisperer automatically wraps your OpenAI API calls with ghost detection pre-processing and compaction post-processing. No code changes requiredβ€”just set the environment variable.

Claude Setup

export ANTHROPIC_API_KEY="sk-ant-xxxxxxxxxxxxxxxxxxxxxxx"

Our integration with Claude supports all current and future models. The engine is model-agnostic, meaning as new Claude versions release, your optimization profiles continue working.

Hybrid Mode

Use both APIs simultaneously for maximum uptime. Token Whisperer will route requests based on:

  • Cost optimization (cheapest model for the task)
  • Latency requirements (fastest response)
  • Quality thresholds (best performance for complex prompts)

πŸ›ŽοΈ Customer Support and Multilingual Readiness

Our Responsive UI adapts to any screen size, and the multilingual support means you can optimize prompts in English, Spanish, Mandarin, Arabic, Hindi, French, German, Japanese, Korean, and more. Each language has a custom token optimizer tuned to its linguistic structure.

24/7 customer service is built into the system via:

  • Automated email alerts on token threshold breaches
  • Slack/Discord webhook integration for team notifications
  • REST API for custom monitoring dashboards
  • Built-in rate limiting and failover for enterprise deployments

⚠️ Disclaimer

Token Whisperer is designed to optimize token usage and improve context quality for LLM interactions. While it significantly reduces cost and improves response accuracy, it does not guarantee:

  • Complete elimination of all ghost tokens (some may be undetectable)
  • Zero context quality decay in extremely long conversations (10,000+ messages)
  • Compatibility with all custom or proprietary LLM APIs
  • 100% uptime for API integrations (dependent on third-party services)

MIT License applies. Use at your own risk. The developers are not responsible for damages resulting from aggressive compaction that may alter intended prompt meaning. Always test optimized prompts in a staging environment before production deployment.


πŸ“œ License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License

Copyright (c) 2026 Token Whisperer Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction...

Download


Token Whisperer v2.4.1 | Built with ❀️ for LLM practitioners who refuse to let their conversations decay

Releases

No releases published

Packages

 
 
 

Contributors

Languages