A command-line Python script to generate high-quality speech from text using the Piper TTS engine.
This utility provides a simple, configurable interface to synthesize speech for multiple languages and can be used as a standalone tool or as a backend for other applications. All paths and model settings are managed in a central config.ini file for easy customization.
- Configuration-Driven: All paths and settings are managed in an easy-to-edit config.ini file.
- Portable: Relative paths allow the entire project folder to be moved without breaking.
- Extensible: Add new languages and voices by simply updating the configuration file.
- Flexible Input: Synthesize text provided directly as an argument or from the system clipboard.
- File Output: Save the generated audio to a specified file path.
- Standalone Playback: Instantly play back the generated audio for quick tests.
- Windows 11: These instructions are tailored for Windows 11.
- Python 3: Python 3 must be installed on your system.
- FFplay (Optional): For direct audio playback, ffplay.exe (part of FFmpeg) should be available on your system.
Step 1: Clone this Repository
git clone https://github.com/voothi/202412060110-piper-tts.git
cd 202412060110-piper-tts
Step 2: Download Piper Engine and Voices
Go to the Releases Page and download the following two files:
- piper-windows-amd64.zip
- piper-voices-de-en-ru.zip
Step 3: Unzip and Organize Files
You must place the contents of the archives into specific folders inside the cloned repository directory.
- Inside the 202412060110-piper-tts folder, create a new folder named piper.
- Extract the contents of piper-windows-amd64.zip directly into this new piper folder.
- Back in the main project folder, create another new folder named piper-voices.
- Extract the contents of piper-voices-de-en-ru.zip into the piper-voices folder.
Your final folder structure should look like this:
202412060110-piper-tts/
├── piper/
│ ├── piper.exe
│ └── ... (other required files)
├── piper-voices/
│ ├── de/
│ ├── en/
│ └── ru/
├── config.ini
├── config.ini.template
├── piper_tts.py
└── README.md
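Before moving on, the layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the function name `find_missing_entries` and the list of checked entries are assumptions drawn from the tree above, and config.ini is reported as missing until you create it in Step 4.

```python
from pathlib import Path

# Entries taken from the folder tree above. The voice subfolders are not
# checked because they depend on which voice archive was extracted.
EXPECTED_ENTRIES = [
    "piper/piper.exe",
    "piper-voices",
    "config.ini",
    "piper_tts.py",
]


def find_missing_entries(project_root):
    """Return the expected entries that do not exist under project_root."""
    root = Path(project_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = find_missing_entries(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

Run it from the project root; an empty result means every listed entry was found.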
Step 4: Configure the Script
- Find the file config.ini.template in the project directory.
- Make a copy of this file and rename the copy to config.ini.
- Open config.ini and edit the paths to match your system. See the Configuration section below for details.
Step 5: Install Python Dependencies
pip install pyperclip
Step 6: Test the Installation
Run a test command. The --lang argument is optional; if you omit it, the script uses the default language from your config file.
python piper_tts.py --text "Hello, world."
You should hear the synthesized audio.
All script settings are managed in config.ini. Note that config.ini is ignored by Git; you should copy config.ini.template to config.ini for your local setup.
The most important setting to check is ffplay_executable.
- [paths] section:
  - piper_executable: Path to piper.exe relative to the project root. The default should be correct if you followed the setup guide.
  - voices_directory: Path to the piper-voices folder. The default should be correct.
  - ffplay_executable: You must provide the full, absolute path to ffplay.exe on your system for audio playback to work. If you don't need playback, you can leave this empty.
- [tts_settings] section:
  - supported_languages: A comma-separated list of language codes you want to use.
  - default_lang: The language to use if you don't specify one with the --lang flag.
- [voice_*] sections:
  - Each section defines the model and config files for a specific language.
  - model: Path to the .onnx model file (relative to voices_directory).
  - config: Path to the .onnx.json config file (relative to voices_directory).
  - speaker (Optional): The default speaker ID for this model (e.g., speaker = 1 for Mykyta in the Ukrainian model). Defaults to 0 if omitted.
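To make the relationship between these sections concrete, here is a minimal sketch of how the settings could be read with Python's standard configparser module. It is not the code in piper_tts.py. The function name `load_voice`, the sample file names, and the assumption that each `[voice_*]` section is named after its language code (for example `[voice_en]`) are illustrative only.

```python
import configparser
from pathlib import Path

# Hypothetical sample; the model file names are placeholders, not real voices.
SAMPLE_CONFIG = """
[paths]
piper_executable = piper/piper.exe
voices_directory = piper-voices
ffplay_executable =

[tts_settings]
supported_languages = de, en, ru
default_lang = en

[voice_en]
model = en/example-voice.onnx
config = en/example-voice.onnx.json
"""


def load_voice(config_text, project_root, lang=None):
    """Resolve the Piper executable and voice files for one language."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    settings = parser["tts_settings"]
    supported = [code.strip() for code in settings["supported_languages"].split(",")]
    lang = lang or settings["default_lang"]
    if lang not in supported:
        raise ValueError(f"Unsupported language: {lang}")

    root = Path(project_root)
    voices_dir = root / parser["paths"]["voices_directory"]
    voice = parser[f"voice_{lang}"]  # assumed section naming
    return {
        "lang": lang,
        "piper": root / parser["paths"]["piper_executable"],
        "model": voices_dir / voice["model"],
        "config": voices_dir / voice["config"],
        # The speaker key is optional and falls back to 0, as described above.
        "speaker": voice.getint("speaker", fallback=0),
    }
```

Because the paths are joined onto the project root, moving the project folder does not require editing these entries.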
| Argument | Description | Required |
|---|---|---|
| --lang | Language code. If omitted, uses the default from config.ini. | No |
| --text | The text string to synthesize. | No |
| --clipboard | If present, use the text currently in the system clipboard as input. | No |
| --output-file | Full path to save the output .wav file. Disables auto-playback. | No |
| --speaker | The speaker ID to use (default is 0). | No |
Note: You must provide either --text or --clipboard.
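For readers wiring up a similar interface, the argument table can be expressed with argparse roughly as follows. This is a sketch under assumptions rather than the script's actual parser: `build_parser` is a hypothetical name, treating --text and --clipboard as mutually exclusive is a guess (the note only says one of them is needed), and leaving --speaker unset by default assumes the per-voice speaker from config.ini is applied afterwards.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line arguments."""
    parser = argparse.ArgumentParser(description="Synthesize speech with Piper.")
    parser.add_argument("--lang", help="Language code; defaults to config.ini.")

    # Exactly one input source is required in this sketch (an assumption).
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--text", help="The text string to synthesize.")
    source.add_argument(
        "--clipboard",
        action="store_true",
        help="Use the text currently in the system clipboard.",
    )

    parser.add_argument(
        "--output-file",
        help="Path for the output .wav file; disables auto-playback.",
    )
    # None lets the caller fall back to the voice's configured speaker, then 0.
    parser.add_argument("--speaker", type=int, default=None, help="Speaker ID.")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["--lang", "de", "--text", "Hallo"]))
```

Calling the parser with neither --text nor --clipboard exits with a usage error, which matches the note above.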
Example Commands:
# Synthesize German text and play it back
python piper_tts.py --lang de --text "Hallo, wie geht es Ihnen?"
# Use the default language (e.g., 'en') with text from the clipboard
python piper_tts.py --clipboard
# Synthesize English text and save it to a file (no playback)
python piper_tts.py --lang en --text "This is a test." --output-file "C:\temp\test_audio.wav"
# Synthesize Ukrainian text using the "Mykyta" voice (Speaker 1)
python piper_tts.py --lang uk --speaker 1 --text "Слава Україні!"
This script serves as the official backend for the gTTS Player with Piper Fallback for Anki add-on.
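Another application can use the script as a backend by invoking it as a subprocess with the documented arguments. The sketch below shows one way to do that; it does not describe how the Anki add-on actually calls the script, and the helper names `build_tts_command` and `synthesize_to_file` are hypothetical.

```python
import subprocess
import sys


def build_tts_command(text, lang=None, output_file=None, speaker=None,
                      script="piper_tts.py"):
    """Assemble the argument list for one piper_tts.py invocation."""
    command = [sys.executable, script, "--text", text]
    if lang:
        command += ["--lang", lang]
    if speaker is not None:
        command += ["--speaker", str(speaker)]
    if output_file:
        command += ["--output-file", output_file]
    return command


def synthesize_to_file(text, output_file, **options):
    """Run the script and raise CalledProcessError if it exits with an error."""
    command = build_tts_command(text, output_file=output_file, **options)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with text that contains spaces or quotation marks.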
Beyond Anki, this script can be integrated into your desktop environment to provide system-wide text-to-speech functionality. By using the provided AutoHotkey v2 scripts, you can select text in any application and have it read aloud with a keyboard shortcut.
- tts.ahk: A script that triggers piper_tts.py to read the currently selected text using different hotkeys for each language (e.g., English, German, Russian).
- kill-ffplay.ahk: A utility hotkey to immediately terminate the audio playback, useful for stopping long sentences.
This project is part of the Kardenwort environment, designed to create a focused and efficient learning ecosystem.