| title | Text-to-Speech Plugin |
|---|---|
| sidebarTitle | TTS |
| description | Text-to-speech plugin for Milady — ElevenLabs, OpenAI TTS, and Edge TTS voice synthesis. |
The Text-to-Speech (TTS) plugin enables Milady agents to synthesize speech from text, providing voice responses through ElevenLabs, OpenAI TTS, or Microsoft Edge TTS.
Package: @elizaos/plugin-tts
The TTS plugin registers a TEXT_TO_SPEECH model handler and actions that allow agents to generate audio from text. Generated audio can be played in voice channels (Discord, Telegram voice), saved to files, or streamed to the client.
Install the plugin:

    milady plugins install tts

Enable it in the configuration:

    {
      "features": {
        "tts": true
      }
    }

ElevenLabs provides high-quality voice synthesis with voice cloning and emotion control.
Package: @elizaos/plugin-elevenlabs
| Environment Variable | Required | Description |
|---|---|---|
| `ELEVENLABS_API_KEY` | Yes | ElevenLabs API key from elevenlabs.io |
| `ELEVENLABS_VOICE_ID` | No | Voice ID (default: Rachel) |
| `ELEVENLABS_MODEL_ID` | No | Model ID (default: `eleven_turbo_v2_5`) |
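The defaults in the table can be read as a small resolution step over the environment. The sketch below is illustrative only: `resolveElevenLabsSettings` and `ElevenLabsSettings` are hypothetical names, not part of `@elizaos/plugin-elevenlabs`, and the plugin's actual handling of blank or malformed values is not documented here.

```typescript
// Hypothetical helper, not the plugin's API: resolves the three
// ElevenLabs variables and applies the documented defaults.
interface ElevenLabsSettings {
  apiKey: string;
  voiceId: string;
  modelId: string;
}

const DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"; // Rachel
const DEFAULT_MODEL_ID = "eleven_turbo_v2_5";

function resolveElevenLabsSettings(
  env: Record<string, string | undefined>,
): ElevenLabsSettings {
  const apiKey = (env["ELEVENLABS_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    // The API key is the only required variable.
    throw new Error("ELEVENLABS_API_KEY is required for the ElevenLabs provider");
  }
  // Assumption: a blank optional value falls back to the default.
  const voiceId = (env["ELEVENLABS_VOICE_ID"] ?? "").trim() || DEFAULT_VOICE_ID;
  const modelId = (env["ELEVENLABS_MODEL_ID"] ?? "").trim() || DEFAULT_MODEL_ID;
  return { apiKey, voiceId, modelId };
}
```

In a Node.js process this could be called as `resolveElevenLabsSettings(process.env)`.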
Example configuration:

    {
      "features": {
        "tts": {
          "enabled": true,
          "provider": "elevenlabs",
          "voiceId": "21m00Tcm4TlvDq8ikWAM",
          "modelId": "eleven_turbo_v2_5"
        }
      }
    }

To use OpenAI TTS, set the provider to `openai`:

    {
      "features": {
        "tts": {
          "enabled": true,
          "provider": "openai",
          "voice": "alloy",
          "model": "tts-1"
        }
      }
    }

Requires `OPENAI_API_KEY`.
Package: @elizaos/plugin-edge-tts
Microsoft Edge TTS is free and requires no API key. Synthesis is performed through Microsoft’s Edge TTS cloud (node-edge-tts talks to Microsoft’s service). Quality is lower than ElevenLabs but suitable for development.
Milady default: When `@elizaos/plugin-agent-orchestrator` is loaded, Milady automatically adds `@elizaos/plugin-edge-tts` so that swarm and PTY code paths that call `TEXT_TO_SPEECH` have a handler. As a result, a default install with the orchestrator can make outbound calls to Microsoft whenever those code paths run TTS, even if you never enabled TTS in `features`.
Opt out of auto-load: set MILADY_DISABLE_EDGE_TTS=1 (or ELIZA_DISABLE_EDGE_TTS=1) in the environment or ~/.milady/.env, or disable the plugin entry: plugins.entries["edge-tts"].enabled: false. See Environment variables (MILADY_DISABLE_EDGE_TTS).
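The auto-load and opt-out rules can be summarized as a single predicate. This is a sketch of the behavior as described in this section, not Milady's implementation: the function names and the `MiladyConfigSketch` shape are hypothetical, and treating only the literal value `1` as an opt-out is an assumption.

```typescript
// Hypothetical config shape: only the fields named in the text are modeled.
interface PluginEntrySketch {
  enabled?: boolean;
}
interface MiladyConfigSketch {
  plugins?: { entries?: Record<string, PluginEntrySketch | undefined> };
}

function isEdgeTtsAutoLoadDisabled(
  env: Record<string, string | undefined>,
  config: MiladyConfigSketch = {},
): boolean {
  // Assumption: only the literal value "1" counts as an opt-out.
  const disabledByEnv =
    env["MILADY_DISABLE_EDGE_TTS"] === "1" ||
    env["ELIZA_DISABLE_EDGE_TTS"] === "1";
  // plugins.entries["edge-tts"].enabled: false
  const disabledByConfig =
    config.plugins?.entries?.["edge-tts"]?.enabled === false;
  return disabledByEnv || disabledByConfig;
}

function shouldAutoLoadEdgeTts(
  loadedPlugins: string[],
  env: Record<string, string | undefined>,
  config: MiladyConfigSketch = {},
): boolean {
  // Edge TTS is only auto-added when the orchestrator plugin is loaded.
  const orchestratorLoaded =
    loadedPlugins.indexOf("@elizaos/plugin-agent-orchestrator") !== -1;
  return orchestratorLoaded && !isEdgeTtsAutoLoadDisabled(env, config);
}
```

Either environment variable or the plugin entry is enough on its own to suppress the auto-load in this sketch.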
Example configuration:

    {
      "features": {
        "tts": {
          "enabled": true,
          "provider": "edge-tts",
          "voice": "en-US-AriaNeural"
        }
      }
    }

ElevenLabs voices:

| Voice ID | Name | Description |
|---|---|---|
| `21m00Tcm4TlvDq8ikWAM` | Rachel | Calm, professional female |
| `AZnzlk1XvdvUeBnXmlld` | Domi | Strong female |
| `EXAVITQu4vr4xnSDxMaL` | Bella | Soft female |
| `ErXwobaYiN019PkySvjV` | Antoni | Well-rounded male |
| `MF3mGyEYCl7XYWbV9V6O` | Elli | Emotional female |
| `TxGEqnHWrfWFTfGW9XjX` | Josh | Deep male |
Browse all voices at elevenlabs.io/voice-library.
ElevenLabs models:

| Model ID | Description |
|---|---|
| `eleven_turbo_v2_5` | Fastest, lowest latency |
| `eleven_turbo_v2` | Fast, good quality |
| `eleven_multilingual_v2` | Multilingual support |
| `eleven_monolingual_v1` | English only, high quality |
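To show where a voice ID and model ID end up, the sketch below builds (but does not send) a request for the ElevenLabs text-to-speech REST endpoint. The endpoint path, the `xi-api-key` header, and the `model_id` field follow ElevenLabs' public API as commonly documented, so check them against the current API reference. Whether the plugin issues its requests this way is not stated here, and `buildElevenLabsRequest` is a hypothetical name.

```typescript
// Hypothetical request description; not the plugin's internal type.
interface ElevenLabsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildElevenLabsRequest(
  text: string,
  apiKey: string,
  voiceId: string = "21m00Tcm4TlvDq8ikWAM", // Rachel
  modelId: string = "eleven_turbo_v2_5",
): ElevenLabsRequest {
  return {
    // The voice is selected by path segment, the model by body field.
    url: "https://api.elevenlabs.io/v1/text-to-speech/" + encodeURIComponent(voiceId),
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text: text, model_id: modelId }),
  };
}
```

The returned object can be passed to `fetch(request.url, request)`; on success the response body is the encoded audio.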
OpenAI voices:

| Voice | Description |
|---|---|
| `alloy` | Neutral |
| `echo` | Male |
| `fable` | British male |
| `onyx` | Deep male |
| `nova` | Female |
| `shimmer` | Soft female |
OpenAI models:

| Model | Description |
|---|---|
| `tts-1` | Faster, lower latency |
| `tts-1-hd` | Higher quality |
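The voice and model names above map directly onto the body of an OpenAI speech request. The sketch below builds such a request without sending it. The `/v1/audio/speech` endpoint and the `model`, `voice`, and `input` fields follow OpenAI's public API as commonly documented; the function name is hypothetical, and the `alloy` and `tts-1` defaults simply mirror the example configuration rather than any documented plugin default.

```typescript
// Voices and models limited to the ones listed in the tables above.
type OpenAiVoice = "alloy" | "echo" | "fable" | "onyx" | "nova" | "shimmer";
type OpenAiTtsModel = "tts-1" | "tts-1-hd";

// Hypothetical request description; not the plugin's internal type.
interface OpenAiSpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildOpenAiSpeechRequest(
  input: string,
  apiKey: string,
  voice: OpenAiVoice = "alloy",
  model: OpenAiTtsModel = "tts-1",
): OpenAiSpeechRequest {
  return {
    url: "https://api.openai.com/v1/audio/speech",
    method: "POST",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: model, voice: voice, input: input }),
  };
}
```

Typing `voice` and `model` as unions means a misspelled name such as `"aloy"` is rejected at compile time instead of failing at request time.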
| Action | Description |
|---|---|
| `SPEAK` | Convert text to speech and play/return audio |
| `GENERATE_AUDIO` | Generate an audio file from text |
| `SET_VOICE` | Change the active voice |
After the plugin is loaded:

- "Read this article to me"
- "Say the following in a cheerful voice: Welcome to Milady!"
- "Generate an audio file from this text"
When combined with Discord or Telegram connectors, the TTS plugin enables voice channel support:
- Discord: Agent joins voice channels and speaks responses
- Telegram: Agent sends voice messages as `.ogg` files
| Format | Use Case |
|---|---|
| `mp3` | Streaming, Discord, general |
| `ogg_vorbis` | Telegram voice messages |
| `pcm` | Low-latency streaming |
| `wav` | Archival, high quality |
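The format table can be encoded as a lookup from delivery target to output format. The target labels below (`discord`, `telegram-voice`, and so on) are hypothetical names chosen for this sketch, not identifiers the plugin defines.

```typescript
// Formats listed in the table above.
type AudioFormat = "mp3" | "ogg_vorbis" | "pcm" | "wav";

// Hypothetical target labels, one per use case in the table.
type AudioTarget =
  | "general"
  | "discord"
  | "telegram-voice"
  | "low-latency-stream"
  | "archive";

const FORMAT_BY_TARGET: Record<AudioTarget, AudioFormat> = {
  general: "mp3",
  discord: "mp3",
  "telegram-voice": "ogg_vorbis",
  "low-latency-stream": "pcm",
  archive: "wav",
};

function pickAudioFormat(target: AudioTarget): AudioFormat {
  return FORMAT_BY_TARGET[target];
}
```

Because the record is keyed by the full `AudioTarget` union, adding a new target without choosing a format for it fails type checking.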
- Image Generation Plugin — Image synthesis
- Computer Use Plugin — Desktop automation
- Media Generation Guide — Full media generation guide