A comprehensive music agent framework built on smolagents, integrating state-of-the-art music AI models for understanding, generation, and interaction.
- ChatMusician Integration: Natural language music analysis and understanding
- Music Theory Analysis: Automatic analysis of musical structures, harmony, and form
- Audio Understanding: Content analysis of audio files
- Symbolic Music Generation: ABC notation generation using NotaGen
- Audio Generation: High-quality audio synthesis using Stable Audio Open
- Conditional Generation: Generate music based on text prompts, styles, and constraints
- smolagents Integration: Intelligent agent system that decides which tools to use
- Multi-modal Support: Text, audio, and symbolic music inputs
- Tool Orchestration: Seamless integration between different music AI models
- Gradio Web Interface: User-friendly web interface for interaction
- Verovio Integration: Beautiful sheet music rendering
- Audio Playback: Integrated audio player for generated content
- Docker Support: Easy deployment with Docker containers
- Cloud Ready: Scalable deployment options
- API Endpoints: RESTful API for programmatic access
# Build and run with Docker Compose
docker-compose up --build
# Access the interface at http://localhost:7860
# Clone the repository
git clone https://github.com/manoskary/music-agent.git
cd music-agent
# Install dependencies
pip install -e .
# Set up environment variables
cp .env.example .env
# Edit .env with your configuration
# Run the application
music-agent serve
After installation, you can run the demonstration scripts to see the AI capabilities:
# Run the success demo to verify everything is working
python demos/demo_success.py
# Run the complete AI demo to see all features
python demos/demo_ai_complete.py
# See all available demos
ls demos/
- Open your browser to http://localhost:7860
- Upload audio files or enter text prompts
- Select the type of music task you want to perform
- Let the agent decide which tools to use and generate results
from music_agent import MusicAgent
# Initialize the agent
agent = MusicAgent()
# Generate music from text
result = agent.run("Create a peaceful piano piece in C major")
# Analyze uploaded audio
analysis = agent.run("Analyze the harmony and structure of this piece", audio_file="song.wav")
# Generate ABC notation
abc_notation = agent.run("Convert this melody to ABC notation", audio_file="melody.wav")
# Generate music from text
music-agent generate --text "Create a jazz composition for piano and saxophone"
# Analyze audio file
music-agent analyze --audio "path/to/song.wav"
# Convert between formats
music-agent convert --input "song.abc" --output "song.wav"
The Music Agent Framework consists of several key components:
- MusicAgent: Main agent class built on smolagents
- Tool Router: Intelligent routing between different music tools
- Context Manager: Maintains conversation and task context
- ChatMusicianTool: Music understanding and analysis
- NotaGenTool: Symbolic music generation in ABC notation
- StableAudioTool: High-quality audio generation
- AudioAnalysisTool: Audio content understanding
- VerovioTool: Sheet music visualization
- Gradio UI: Web-based user interface
- FastAPI Server: RESTful API endpoints
- CLI: Command-line interface
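To make the routing idea concrete, here is a minimal sketch of how a request might be dispatched to one of the tools listed above. It is not the framework's implementation: the real agent relies on smolagents to choose tools, whereas this stand-in uses plain keyword matching. The `Tool` dataclass, the `KeywordRouter` class, and the keyword lists are all hypothetical; only the tool names come from the component list.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    """Hypothetical stand-in for a music tool (not the framework's class)."""
    name: str
    keywords: List[str]
    run: Callable[[str], str]


class KeywordRouter:
    """Picks the tool whose keywords overlap most with the request.

    Assumption: keyword matching replaces the LLM-based tool selection
    that an agent framework would normally perform.
    """

    def __init__(self, tools: List[Tool], fallback: Tool) -> None:
        self.tools = tools
        self.fallback = fallback

    def route(self, request: str) -> Tool:
        # Tokenize into lowercase words so "abc" does not match inside other words.
        words = set(re.findall(r"[a-z0-9]+", request.lower()))
        best = max(self.tools, key=lambda t: len(words & set(t.keywords)))
        # If no keyword matched at all, use the fallback tool.
        if not words & set(best.keywords):
            return self.fallback
        return best


# Illustrative keyword lists; these are assumptions, not project configuration.
chat = Tool("ChatMusicianTool", ["analyze", "harmony", "theory", "explain"],
            lambda req: f"analysis of: {req}")
notagen = Tool("NotaGenTool", ["abc", "notation", "score"],
               lambda req: f"ABC notation for: {req}")
audio = Tool("StableAudioTool", ["audio", "synthesize", "sound"],
             lambda req: f"audio for: {req}")

router = KeywordRouter([chat, notagen, audio], fallback=chat)

if __name__ == "__main__":
    request = "Generate ABC notation for a waltz"
    tool = router.route(request)
    print(tool.name, "->", tool.run(request))
```

Keeping each tool behind a common `run` callable means the router only needs a name and a description of what the tool handles, which is roughly the information an LLM-driven agent would also use when selecting a tool.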
Create a .env file in the project root:
# Model configurations
CHATMUSICIAN_MODEL_ID=m-a-p/ChatMusician
NOTAGEN_MODEL_PATH=./models/notagen
STABLE_AUDIO_MODEL_ID=stabilityai/stable-audio-open-1.0
# Hugging Face Hub
HF_TOKEN=your_huggingface_token
# Server configuration
HOST=0.0.0.0
PORT=7860
DEBUG=false
# GPU configuration
DEVICE=cuda
TORCH_DTYPE=float16
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Format code
black src/ tests/
isort src/ tests/
# Type checking
mypy src/
music-agent/
├── src/music_agent/    # Main package
│   ├── agents/         # Agent implementations
│   ├── tools/          # Music tool implementations
│   ├── interfaces/     # UI and API interfaces
│   ├── models/         # Model wrappers and utilities
│   └── utils/          # Utility functions
├── tests/              # Test suite
├── demos/              # Demo scripts and examples
├── dev-tools/          # Development and testing tools
├── scripts/            # Utility and setup scripts
├── docker/             # Docker configuration
├── examples/           # Usage examples
├── docs/               # Documentation
└── models/             # Downloaded model files
- ChatMusician: Music understanding and analysis
  - Model: m-a-p/ChatMusician
  - Capabilities: Music theory, harmony analysis, composition guidance
- NotaGen: Symbolic music generation
  - Model: Custom implementation with pre-trained weights
  - Output: ABC notation format
- Stable Audio Open: Audio generation
  - Model: stabilityai/stable-audio-open-1.0
  - Output: High-quality 44.1kHz stereo audio
- Audio Analysis: Content understanding
  - Multiple models for different analysis tasks
  - Capabilities: Genre classification, mood detection, structure analysis
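Since the symbolic generation output is ABC notation, it may help to see what that format looks like. The sketch below reads the header fields of a short ABC tune. The sample tune is hand-written for illustration and is not actual NotaGen output, and `parse_abc_header` is a hypothetical helper, not part of the framework.

```python
from typing import Dict

# Hand-written sample tune (an assumption for illustration only).
SAMPLE_ABC = """X:1
T:Peaceful Piece
M:4/4
L:1/4
K:C
C E G c | G E C2 |]
"""


def parse_abc_header(abc_text: str) -> Dict[str, str]:
    """Collect single-letter header fields (X:, T:, M:, L:, K:) from an ABC tune.

    In ABC notation the K: (key) field closes the tune header, so parsing
    stops there and the note lines that follow are ignored.
    """
    header: Dict[str, str] = {}
    for raw_line in abc_text.splitlines():
        line = raw_line.strip()
        # A header field is one letter, a colon, then the value.
        if len(line) >= 2 and line[0].isalpha() and line[1] == ":":
            header[line[0]] = line[2:].strip()
            if line[0] == "K":
                break
    return header


if __name__ == "__main__":
    fields = parse_abc_header(SAMPLE_ABC)
    print(f"Title: {fields['T']}, meter: {fields['M']}, key: {fields['K']}")
```

Here `X:` is the tune reference number, `T:` the title, `M:` the meter, `L:` the default note length, and `K:` the key. A renderer such as Verovio or a synthesis step would consume the full text, including the note lines that this helper skips.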
- Music Composition: Generate complete musical pieces
- Harmony Analysis: Analyze chord progressions and harmonic structure
- Style Transfer: Convert music between different styles
- Audio Synthesis: Convert symbolic music to audio
- Format Conversion: Between ABC, MIDI, MusicXML, and audio formats
- Music Education: Explain music theory concepts and analysis
We welcome contributions! Please see our Contributing Guide for details.
This project is licensed under the MIT License - see the LICENSE file for details.
- smolagents - Agent framework
- ChatMusician - Music understanding
- NotaGen - Symbolic music generation
- Stable Audio - Audio generation
- Verovio - Music notation rendering
If you use this framework in your research, please cite:
@software{music_agent_framework,
title={Music Agent Framework: Comprehensive Music AI with smolagents},
author={Music Agent Team},
year={2025},
url={https://github.com/music-agent/music-agent}
}