Incident Buddy is a CLI tool for generating vendor-specific observability queries to assist incident responders during an investigation.
By leveraging LLMs, it creates contextually relevant queries for multiple observability platforms, saving time and reducing cognitive load in high-pressure situations.
- Generate contextually relevant queries for multiple observability platforms
- Interactive CLI for incident data collection
- Integration with LLMs (currently the only supported provider is Ollama)
- Support for New Relic, Honeycomb, Sumo Logic, Datadog, and AWS CloudWatch
To run Incident Buddy locally, you will need:

- Python 3.12+
- Ollama installed locally with your preferred model pulled (see the setup sketch below)
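As a quick way to satisfy the Ollama requirement, you can pull a model and make sure the server is reachable. The commands below use the standard Ollama CLI; the model name is only an example:

```bash
# Pull a model of your choice (llama3.2 is just an example)
ollama pull llama3.2

# Confirm the model is available locally
ollama list

# Start the Ollama server if it is not already running
ollama serve
```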
To run the CLI with Docker or Docker Compose, please refer to the Docker guide; a rough sketch of the flow is shown below.
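For illustration only, and assuming the repository ships a compose file with a service named `incident-buddy` (the Docker guide is authoritative here), the flow might look like:

```bash
# Hypothetical: the service name and compose file come from the Docker guide
docker compose run --rm incident-buddy incident create
```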
If you would like to contribute, please refer to the contributing guide.
To use Incident Buddy, you can run it as a module:
```bash
# Show available commands
python -m incident_buddy.main --help

# Create an incident investigation
python -m incident_buddy.main incident create
```

If you want to try it with a test incident example, you can run:

```bash
python -m incident_buddy.main incident test
```

You can also use it after installing with pip; don't forget to set the needed environment variables first (a sketch follows the commands below):

```bash
# If installed via pip
incident-buddy --help
incident-buddy incident create
```
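The exact environment variables depend on your setup; purely as a hypothetical sketch, assuming the tool honors Ollama's standard `OLLAMA_HOST` endpoint variable plus a made-up model selector:

```bash
# Hypothetical: INCIDENT_BUDDY_MODEL is an illustrative name, not a documented variable;
# OLLAMA_HOST is the standard variable Ollama clients use to locate the server.
export OLLAMA_HOST="http://localhost:11434"
export INCIDENT_BUDDY_MODEL="llama3.2"
incident-buddy incident create
```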

Here are some optimizations and features I would love to implement in the future:
- Support more observability platforms (e.g., Prometheus, Tempo).
- Support more LLM providers (OpenAI, Anthropic, etc.).
- Enrich prompt templates and implement few-shot prompting to improve accuracy.
- Allow injecting additional/custom context and metadata related to the incident into the prompts.
- Implement retries within the LLM client.
- Add database support (e.g., using Redis) to update incident details and regenerate queries as the incident investigation progresses and more details become available.
