Telegram Speech-to-Text & Event Bot

A Telegram bot for group interaction, voice message transcription, and (optionally) Google Calendar event creation from event poster photos. It uses Google Gemini for speech-to-text and text generation, is designed for easy Docker deployment, and notifies an admin on critical errors.

Features

Listens for voice and text messages in Telegram groups
Transcribes speech to text using Gemini
Replies with the transcription or an error message
Supports Gemini for text generation and context-aware replies
Optimizes context memory with a recent-message window plus optional long-term summaries
Can auto-react to messages using Gemini (optional)
Optional multi-model Gemini routing with per-model RPM/RPD limits
Can create Google Calendar events from event poster photos (optional)
Notifies admin on critical errors
Logs to stdout and temporary files
Access control via allowed chat/user IDs
Configurable via environment variables

Requirements

Docker (recommended) or Python 3.9+
Telegram bot token
Allowed chat/user IDs (for access control)
Google Gemini API key (for Gemini features)
(Optional) Admin Telegram chat ID (for error notifications)
(Optional) Google Calendar credentials and calendar ID (for event creation)

Setup

1. Create a Telegram Bot

Talk to @BotFather on Telegram
Create a new bot and get the token
(Optional) Disable privacy mode (/setprivacy in @BotFather) so the bot can receive all messages in a group

2. Clone the Repository

git clone <your-repo-url>
cd kabanus

3. Configure Environment Variables

Create a .env file or set environment variables:

# Core
TELEGRAM_BOT_TOKEN=your-telegram-bot-token
ALLOWED_CHAT_IDS=comma,separated,chat,ids
ADMIN_CHAT_ID=your-admin-chat-id           # Optional, for error notifications

# Gemini and AI behavior
GEMINI_API_KEY=your-gemini-api-key         # Required for Gemini support
GOOGLE_API_KEY=your-google-api-key         # Optional, defaults to GEMINI_API_KEY if unset
GEMINI_MODEL=gemini-2.0-flash              # Optional, default is gemini-2.0-flash
GEMINI_MODELS=[{"name":"gemini-2.5-flash","rpm":60,"rpd":1000}] # Optional JSON list ordered by preference, overrides GEMINI_MODEL
THINKING_BUDGET=0                          # Optional, Gemini thinking budget
USE_GOOGLE_SEARCH=false                    # Optional, enable Gemini grounding with Google Search
SYSTEM_INSTRUCTIONS_PATH=system_instructions.txt # Optional, path (relative to src/) for system prompt
LANGUAGE=ru                                # Optional, bot response language (default: ru)
TOKEN_LIMIT=500000                         # Optional, context token limit

# Features
ENABLE_MESSAGE_HANDLING=true               # Enable text/voice message handling (default: false)
ENABLE_SCHEDULE_EVENTS=true                # Enable event creation from photos (default: false)
REACTION_ENABLED=false                     # Optional, enable auto-reactions
REACTION_COOLDOWN_SECS=600                 # Optional, seconds between reactions
REACTION_DAILY_BUDGET=50                   # Optional, max reactions per day
REACTION_MESSAGES_THRESHOLD=10             # Optional, messages between reactions
REACTION_GEMINI_MODEL=gemini-2.0-flash     # Optional, defaults to GEMINI_MODEL

# Google Calendar
GOOGLE_CALENDAR_ID=your-calendar-id        # Required for event creation
GOOGLE_CREDENTIALS_PATH=path/to/creds.json # or use GOOGLE_CREDENTIALS_JSON
GOOGLE_CREDENTIALS_JSON='{"type": "..."}'  # Optional, inlined credentials JSON

# Bot behavior
BOT_ALIASES=bot,бот,ботик                  # Optional, comma-separated aliases
CHAT_MESSAGES_STORE_PATH=messages.jsonl    # Optional, history message store file

# Memory/context optimization
MEMORY_ENABLED=true                        # Optional, enable structured context builder
MEMORY_RECENT_TURNS=20                     # Optional, keep last N messages in recent section
MEMORY_RECENT_BUDGET_RATIO=0.85            # Optional, token budget share for recent dialogue
MEMORY_SUMMARY_ENABLED=false               # Optional, enable long-term summary section
MEMORY_SUMMARY_BUDGET_RATIO=0.15           # Optional, token budget share for summaries
MEMORY_SUMMARY_CHUNK_SIZE=16               # Optional, messages per summary chunk
MEMORY_SUMMARY_MAX_ITEMS=4                 # Optional, max summary items injected per request
MEMORY_SUMMARY_MAX_CHUNKS_PER_RUN=1        # Optional, new chunks summarized per runtime call

# Runtime and debugging
DEBUG_MODE=true                            # Optional, enable debug logging
THIRD_PARTY_LOG_LEVEL=WARNING              # Optional, external libs log level (httpx/httpcore/telegram/google_genai)
DOTENV_PATH=path/to/.env                   # Optional, override .env location
SETTINGS_CACHE_TTL=1.0                     # Optional, settings cache TTL in seconds
SETTINGS_REFRESH_INTERVAL=1.0              # Optional, refresh interval (used if job enabled)
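
The access-control and feature-flag variables above are plain strings, so they have to be parsed before use. The following is a minimal sketch of one way to do that, not the bot's actual code: the helper names parse_chat_ids, parse_bool, and is_allowed are hypothetical, and the set of accepted truthy spellings is an assumption.

```python
import os


def parse_chat_ids(raw):
    """Parse a comma-separated ID string into a set of ints, skipping blanks."""
    ids = set()
    for part in raw.split(","):
        part = part.strip()
        if part:
            ids.add(int(part))
    return ids


def parse_bool(raw, default=False):
    """Interpret a flag string; the accepted spellings are an assumption."""
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def is_allowed(chat_id, allowed):
    # An empty allow-list matches nothing, so every chat is denied,
    # which mirrors the note that ALLOWED_CHAT_IDS is required.
    return chat_id in allowed


if __name__ == "__main__":
    allowed = parse_chat_ids(os.environ.get("ALLOWED_CHAT_IDS", ""))
    handling = parse_bool(os.environ.get("ENABLE_MESSAGE_HANDLING"))
    print(f"allowed chats: {sorted(allowed)}, message handling: {handling}")
```

Group chat IDs in Telegram are usually negative numbers, which int() handles without special treatment.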

4. Build and Run with Docker

docker build -t kabanus .
docker run --env-file .env kabanus

5. Run Locally (for development)

Install dependencies:

pip install -r requirements.txt

Run the bot:

python -m src.main

Usage

If ENABLE_SCHEDULE_EVENTS=true, send a photo of an event poster to create a Google Calendar event (requires calendar credentials and ID).
If ENABLE_MESSAGE_HANDLING=true, send a voice message, text, or image to interact with the bot. Mention the bot or reply to one of its messages to get a response.
Use /summary (alias: /view_summary) to inspect the per-chat summary chunks created by the memory summary feature. Examples: /summary (first 3 chunks), /summary 5, /summary tail 5, /summary index 42, /summary budget api, /summary --head 10 --grep budget. Run /summary help for command usage. Summary command requests and responses are not saved to the chat history.
If GEMINI_MODELS is set, the bot tries the models in their listed order of preference and skips any that have hit their RPM/RPD limits.
If REACTION_ENABLED=true, the bot may react to messages using Gemini within the configured budget/cooldown.
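
The multi-model routing described above can be pictured as a preference-ordered list of rate limiters. This is only a sketch under stated assumptions, not the bot's implementation: ModelLimiter, load_limiters, and pick_model are hypothetical names, and the sliding 60-second and 24-hour windows are one possible reading of "RPM" and "RPD".

```python
import json
import time
from collections import deque


class ModelLimiter:
    """Tracks call timestamps for one model (hypothetical helper)."""

    def __init__(self, name, rpm, rpd):
        self.name = name
        self.rpm = rpm  # max requests per sliding 60 seconds (assumed)
        self.rpd = rpd  # max requests per sliding 24 hours (assumed)
        self._calls = deque()  # timestamps of calls within the last day

    def try_acquire(self, now=None):
        now = time.time() if now is None else now
        # Forget calls that are a day old or older.
        while self._calls and now - self._calls[0] >= 86400:
            self._calls.popleft()
        last_minute = sum(1 for t in self._calls if now - t < 60)
        if last_minute >= self.rpm or len(self._calls) >= self.rpd:
            return False
        self._calls.append(now)
        return True


def load_limiters(raw_json):
    """Build limiters from a GEMINI_MODELS-style JSON list, keeping its order."""
    return [ModelLimiter(m["name"], m["rpm"], m["rpd"]) for m in json.loads(raw_json)]


def pick_model(limiters, now=None):
    """Return the first model with spare quota, or None if all are exhausted."""
    for limiter in limiters:
        if limiter.try_acquire(now):
            return limiter.name
    return None


if __name__ == "__main__":
    limiters = load_limiters('[{"name":"gemini-2.5-flash","rpm":60,"rpd":1000}]')
    print(pick_model(limiters))
```

Passing an explicit now value makes the limiter deterministic, which is convenient in unit tests.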

Utilities

scripts/dump_chat.py: Dump Telegram chat history to JSONL (see script for usage).
scripts/backfill_summaries.py: Backfill *.summary.json from existing JSONL history.
scripts/view_summary.py: Inspect summary files quickly from CLI.
scripts/README.md: Detailed script usage and examples.

Memory and Backfill

The bot stores raw messages as JSONL and can optionally use long-term compressed summaries:

Raw history file pattern: messages_<chat_id>.jsonl
Summary file pattern: messages_<chat_id>.summary.json
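
As an illustration of the two file patterns above, the paths can be derived from a chat ID as follows. This is a sketch only: history_path, summary_path, and iter_jsonl are hypothetical helpers, the data directory is an assumption, and no particular record schema is assumed for the JSONL lines.

```python
import json
from pathlib import Path


def history_path(data_dir, chat_id):
    """Raw history file for a chat, following messages_<chat_id>.jsonl."""
    return Path(data_dir) / f"messages_{chat_id}.jsonl"


def summary_path(data_dir, chat_id):
    """Summary file for a chat, following messages_<chat_id>.summary.json."""
    return Path(data_dir) / f"messages_{chat_id}.summary.json"


def iter_jsonl(lines):
    """Yield one decoded JSON object per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


if __name__ == "__main__":
    path = history_path("src/data", -100123)  # example chat ID, not a real one
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            print(sum(1 for _ in iter_jsonl(fh)), "messages")
```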

When MEMORY_SUMMARY_ENABLED=true, context assembly can include both:

recent dialogue window (verbatim)
relevant long-term summary chunks
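
To make the budget split concrete, here is a rough sketch of how a recent window and summary chunks could share TOKEN_LIMIT using the MEMORY_* ratios. It is an assumption-laden illustration, not the bot's context builder: build_context and count_tokens are hypothetical, word counting stands in for a real tokenizer, and taking the first summaries in the list stands in for whatever relevance selection the bot actually performs.

```python
def count_tokens(text):
    # Crude stand-in: whitespace-separated words. A real tokenizer would differ.
    return len(text.split())


def build_context(messages, summaries, token_limit=500000, recent_turns=20,
                  recent_ratio=0.85, summary_ratio=0.15,
                  summary_max_items=4, summary_enabled=False):
    """Split the token budget between recent messages and summary chunks."""
    recent_budget = int(token_limit * recent_ratio)
    summary_budget = int(token_limit * summary_ratio)

    # Walk the last N messages from newest to oldest so that the newest
    # ones survive when the budget runs out, then restore original order.
    recent, used = [], 0
    for msg in reversed(messages[-recent_turns:]):
        cost = count_tokens(msg)
        if used + cost > recent_budget:
            break
        recent.append(msg)
        used += cost
    recent.reverse()

    chosen = []
    if summary_enabled:
        used = 0
        # Placeholder selection: the first items, in the order given.
        for item in summaries[:summary_max_items]:
            cost = count_tokens(item)
            if used + cost > summary_budget:
                break
            chosen.append(item)
            used += cost
    return {"summaries": chosen, "recent": recent}


if __name__ == "__main__":
    ctx = build_context(["hello there", "how are you"], ["earlier: greetings"],
                        token_limit=100, summary_enabled=True)
    print(ctx)
```

The default arguments mirror the example values from the environment variable listing.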

Backfill existing history

Use backfill when you already have a large history and want summary files immediately.

Example with a local Ollama instance:

. .venv/bin/activate
set -a && source dev.stack.env && set +a
MEMORY_SUMMARY_ENABLED=true PYTHONPATH=. python3 -m scripts.backfill_summaries \
  --chat-id=-{chat_id} \
  --source-jsonl src/data/messages_-{chat_id}.jsonl \
  --provider ollama \
  --ollama-url http://127.0.0.1:11434/api/generate \
  --ollama-model gemma3:4b

For a quick experiment on the first chunks only:

MEMORY_SUMMARY_ENABLED=true PYTHONPATH=. python3 -m scripts.backfill_summaries \
  --chat-id=-{chat_id} \
  --source-jsonl src/data/messages_-{chat_id}.jsonl \
  --force-rebuild \
  --provider ollama \
  --max-chunks 20

VS Code Debugging

A .vscode/launch.json is provided. Use the "Run Telegram Bot (src.main)" or "Debug Unit Tests" configurations from the Run & Debug panel.

Notes

All imports in src/ use relative imports (e.g., from .config import ...).
Do not run files in src/ directly; always use the -m module syntax from the project root.
Gemini support requires a valid API key from Google AI Studio.
Google Calendar event creation requires a valid calendar ID and service account credentials.
ALLOWED_CHAT_IDS is required; if empty, the bot denies all users.
DEBUG_MODE controls the application's own debug logs (src.*, __main__).
Use THIRD_PARTY_LOG_LEVEL to reduce dependency noise (default: WARNING).
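
The split between application and dependency log levels can be reproduced with the standard logging module. The sketch below is an assumption about how such a setup might look, not a copy of the bot's code: configure_logging is a hypothetical function, and the logger names are taken from the lists in this README.

```python
import logging

APP_LOGGERS = ("src", "__main__")
THIRD_PARTY_LOGGERS = ("httpx", "httpcore", "telegram", "google_genai")


def configure_logging(debug_mode, third_party_level="WARNING"):
    """Set app loggers from a debug flag and dependency loggers from a level name."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    app_level = logging.DEBUG if debug_mode else logging.INFO
    for name in APP_LOGGERS:
        logging.getLogger(name).setLevel(app_level)

    # Fall back to WARNING when the level name is not a known logging level.
    level = getattr(logging, third_party_level.upper(), None)
    if not isinstance(level, int):
        level = logging.WARNING
    for name in THIRD_PARTY_LOGGERS:
        logging.getLogger(name).setLevel(level)


if __name__ == "__main__":
    configure_logging(debug_mode=True, third_party_level="WARNING")
    logging.getLogger("src.example").debug("visible because DEBUG_MODE is on")
    logging.getLogger("httpx").info("hidden because httpx is capped at WARNING")
```

Child loggers such as src.example inherit the level set on src, so a single call covers the whole package.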

License

MIT
