feat: Add comprehensive documentation to the entire repository
This commit introduces a complete set of documentation for the project,
fulfilling the goal of making the repository accessible and understandable
for new developers.
The following changes have been made:
1. **Full Docstring Coverage**:
   - Added Google-style Python docstrings to all public functions, methods, and classes in the core application files (`app.py`, `memory.py`, `moondream_detector.py`, `web_server.py`).
   - Documented all agent tools (`ansible_tool.py`, `code_runner_tool.py`, `mcp_tool.py`, `power_tool.py`, `ssh_tool.py`, `summarizer_tool.py`, `web_browser_tool.py`).
   - Documented the Python power management service (`power_agent.py`).
   - Added C-style documentation comments to the eBPF traffic monitoring program (`traffic_monitor.c`).
2. **README.md Overhaul**:
   - The main `README.md` has been significantly updated to serve as a complete guide.
   - Expanded the "Agent Architecture" section to detail the `TwinService`.
   - Updated the "Tool Use" section with a complete and detailed list of all available agent tools and their functions.
   - Added a new "Advanced System Features" section to explain the eBPF-based power management system and the Mission Control Web UI.
   - Restructured headings for better clarity and flow.
README.md: 50 additions & 37 deletions
- **`--ask-become-pass`**: This flag is important. It will prompt you for your `sudo` password, which Ansible needs to perform administrative tasks.
- **What this does:** This playbook not only configures the cluster services (Consul, Nomad, etc.) but also automatically bootstraps the primary control node into a fully autonomous AI agent by deploying the necessary AI services.
## 5. Agent Architecture: The `TwinService`
The core of this application is the `TwinService`, a custom service that acts as the agent's "brain." It orchestrates the agent's responses, memory, and tool use.
### 5.1. Memory
- **Short-Term:** Remembers the last 10 conversational turns in a simple list.
- **Long-Term:** Uses a FAISS vector store (`long_term_memory.faiss`) to remember key facts. It performs a semantic search over this memory to retrieve relevant context for new conversations.
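The two memory tiers can be sketched as below. This is an illustrative toy, not the project's actual `memory.py`: the embedding function is a deterministic stand-in for a real sentence-embedding model, and brute-force L2 search stands in for the FAISS index so the snippet stays dependency-light.

```python
import zlib
from collections import deque

import numpy as np

DIM = 32

def embed(text: str) -> np.ndarray:
    # Stand-in for a real sentence-embedding model: a deterministic
    # pseudo-random vector seeded from the text's CRC32 checksum.
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    return rng.random(DIM, dtype=np.float32)

class Memory:
    def __init__(self) -> None:
        self.short_term: deque = deque(maxlen=10)  # last 10 conversational turns
        self.vectors: list = []                    # long-term vectors
        self.facts: list = []                      # text behind each vector

    def add_turn(self, turn: str) -> None:
        self.short_term.append(turn)

    def remember_fact(self, fact: str) -> None:
        self.vectors.append(embed(fact))
        self.facts.append(fact)

    def recall(self, query: str, k: int = 1) -> list:
        # Semantic search: return the k stored facts whose embeddings are
        # nearest the query embedding (FAISS does this efficiently at scale).
        if not self.vectors:
            return []
        dists = np.linalg.norm(np.stack(self.vectors) - embed(query), axis=1)
        return [self.facts[i] for i in np.argsort(dists)[:k]]
```

The `maxlen=10` deque gives the fixed-size short-term buffer for free: appending an 11th turn silently evicts the oldest one.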
### 5.2. Tool Use
The agent can use tools to perform actions and gather information. The `TwinService` dynamically provides the list of available tools to the LLM in its prompt, enabling the LLM to decide which tool to use based on the user's query.
#### Available Tools:
- **Vision (`vision.get_observation`)**: Gets a real-time description of what is visible in the webcam, powered by the Moondream2 model.
- **Master Control Program (`mcp.*`)**: Provides agent introspection and self-control, allowing it to check pipeline status (`get_status`) and manage memory (`get_memory_summary`, `clear_short_term_memory`).
- **SSH (`ssh.run_command`)**: Executes a command on a remote machine via SSH, using key-based or password authentication.
- **Code Runner (`code_runner.run_python_code`)**: Executes Python code in a secure, sandboxed Docker container.
- **Ansible (`ansible.run_playbook`)**: Runs an Ansible playbook to configure and manage the cluster.
- **Web Browser (`web_browser.*`)**: Provides a full web browser (via Playwright) for navigating (`goto`), reading (`get_page_content`), and interacting with websites (`click`, `type`).
- **Power (`power.set_idle_threshold`)**: Controls the cluster's power management policies by setting service idle thresholds.
- **Summarizer (`summarizer.get_summary`)**: Performs an extractive summary of the conversation to find the most relevant points related to a query.
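The "dynamically provides the list of available tools" mechanism can be sketched as a small registry: each tool is registered with a name and description, the descriptions are rendered into the LLM's prompt, and tool calls in the LLM's reply are dispatched by name. This is a hedged sketch, not the `TwinService` code; the class and registration call below are illustrative.

```python
from typing import Callable

class ToolRegistry:
    """Illustrative registry: map tool names to callables plus a
    description that gets rendered into the LLM's system prompt."""

    def __init__(self) -> None:
        self._tools = {}

    def register(self, name: str, fn: Callable[..., str], description: str) -> None:
        self._tools[name] = (fn, description)

    def prompt_section(self) -> str:
        # Rendered into the prompt so the LLM can decide which tool to use.
        lines = [f"- {name}: {desc}" for name, (_, desc) in sorted(self._tools.items())]
        return "Available tools:\n" + "\n".join(lines)

    def dispatch(self, name: str, **kwargs) -> str:
        # Invoked when the LLM's reply requests a tool call.
        fn, _ = self._tools[name]
        return fn(**kwargs)

registry = ToolRegistry()
registry.register(
    "ssh.run_command",
    lambda host, command: f"ran {command!r} on {host}",  # toy stand-in
    "Execute a command on a remote machine via SSH.",
)
```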
### 5.3. Mixture of Experts (MoE) Routing
The agent is designed to function as a "Mixture of Experts." The primary `pipecat` agent acts as a router, classifying the user's query and routing it to a specialized backend expert if appropriate.
- **How it Works:** The `TwinService` prompt instructs the main agent to first classify the user's query. If it determines the query is best handled by a specialist (e.g., a 'coding' expert), it uses the `route_to_expert` tool. This tool call is intercepted by the `TwinService`, which then discovers the expert's API endpoint via Consul and forwards the query.
- **Configuration:** Deploying these specialized experts is done using the `deploy_expert.yaml` Ansible playbook. For detailed instructions, see the [Advanced: Deploying Additional AI Experts](#62-advanced-deploying-additional-ai-experts) section below.
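The intercept-and-forward flow described above can be sketched as follows. Discovery via Consul is stubbed out as a dict lookup, and the function names and reply shape are illustrative assumptions, not the actual `TwinService` interface; only the `prima-api-` naming prefix and the `route_to_expert` tool name come from the project.

```python
from typing import Optional

def discover_expert(consul_services: dict, expert: str) -> Optional[str]:
    # The real TwinService queries Consul for services registered with the
    # "prima-api-" prefix; here discovery is a plain dict lookup.
    return consul_services.get(f"prima-api-{expert}")

def handle_llm_reply(reply: dict, consul_services: dict) -> str:
    # If the router LLM emitted a route_to_expert tool call, intercept it
    # and forward the query to the discovered endpoint; otherwise the
    # reply text is the final answer.
    if reply.get("tool") == "route_to_expert":
        endpoint = discover_expert(consul_services, reply["expert"])
        if endpoint is None:
            return "No such expert is registered."
        return f"forwarded {reply['query']!r} to {endpoint}"  # real code would POST here
    return reply["text"]
```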
### 5.4. Configuring Agent Personas
The personality and instructions for the main router agent and each expert agent are defined in simple text files located in the `ansible/roles/pipecatapp/files/prompts/` directory. You can edit these files to customize the behavior of each agent. For example, you can edit `coding_expert.txt` to give it a different programming specialty.
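Loading such a prompt directory amounts to mapping each `*.txt` file to its persona text, roughly like this (an illustrative sketch; the real loading code in the app may differ):

```python
from pathlib import Path

def load_personas(prompt_dir: str) -> dict:
    # Map each prompt file's stem (e.g. "coding_expert" for
    # coding_expert.txt) to the persona text it contains.
    return {p.stem: p.read_text() for p in sorted(Path(prompt_dir).glob("*.txt"))}
```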
## 6. AI Service Deployment
The system is designed to be self-bootstrapping. The `bootstrap.sh` script (or the main `playbook.yaml`) handles the deployment of the core AI services on the primary control node. This includes a default instance of the `prima-expert` job and the `pipecat` voice agent.
### 6.1. Starting and Stopping Services
Use the provided script to submit the core AI jobs to Nomad:
### 6.2. Advanced: Deploying Additional AI Experts
The true power of this architecture is the ability to deploy multiple, specialized AI experts that the main `pipecat` agent can route queries to. With the new unified `prima-expert.nomad` job template, deploying a new expert is handled through a dedicated Ansible playbook.
1. **Define a Model List for Your Expert:**
The `TwinService` in the `pipecatapp` will automatically discover any service registered in Consul with the `prima-api-` prefix and make it available for routing.
## 7. Advanced System Features
### 7.1. Power Management
To optimize resource usage on legacy hardware, this project includes an intelligent power management system.
- **How it Works:** A Python service, `power_agent.py`, uses an eBPF program (`traffic_monitor.c`) to monitor network traffic to specific services at the kernel level with minimal overhead.
- **Sleep/Wake:** If a monitored service is idle for a configurable period, the power agent automatically stops the corresponding Nomad job. When new traffic is detected, the agent restarts the job.
- **Configuration:** The agent can configure this behavior using the `power.set_idle_threshold` tool.
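The sleep/wake policy can be sketched as an idle watchdog: if the traffic counter hasn't moved for longer than the idle threshold, stop the job; when the counter moves again, restart it. This is a hedged sketch of the policy only, not `power_agent.py` itself: the eBPF byte counter, Nomad stop/start calls, and clock are injected as plain callables so the logic is testable in isolation.

```python
import time
from typing import Callable

class IdleWatchdog:
    """Toy sleep/wake policy. read_bytes stands in for the eBPF traffic
    counter; stop_job/start_job stand in for Nomad API calls."""

    def __init__(self, read_bytes: Callable[[], int],
                 stop_job: Callable[[], None], start_job: Callable[[], None],
                 idle_threshold: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.read_bytes, self.stop_job, self.start_job = read_bytes, stop_job, start_job
        self.idle_threshold = idle_threshold  # adjustable, like power.set_idle_threshold
        self.clock = clock
        self.last_count = read_bytes()
        self.last_active = clock()
        self.running = True

    def poll(self) -> None:
        count = self.read_bytes()
        if count != self.last_count:  # new traffic observed
            self.last_count, self.last_active = count, self.clock()
            if not self.running:
                self.start_job()      # wake the sleeping service
                self.running = True
        elif self.running and self.clock() - self.last_active > self.idle_threshold:
            self.stop_job()           # idle past the threshold: put it to sleep
            self.running = False
```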
### 7.2. Mission Control Web UI
This project includes a web-based dashboard for real-time display and debugging. To access it, navigate to the IP address of any node in your cluster on port 8000 (e.g., `http://192.168.1.101:8000`). The UI provides:
- Real-time conversation logs.
- A request-approval interface for sensitive tool actions.
- The ability to save and load the agent's memory state.
## 8. Testing and Verification
- **Check Cluster Status:** `nomad node status`
- **Check Job Status:** `nomad job status`
- **View Logs:** `nomad alloc logs <allocation_id>` or use the Mission Control Web UI.
- **Manual Test Scripts:** A set of scripts for manual testing of individual components is available in the `testing/` directory.
## 9. Performance Tuning & Service Selection
- **Model Selection:** The `prima-expert.nomad` job is configured via Ansible variables in `group_vars/models.yaml`. You can define different model lists for different experts.
- **Network:** Wired gigabit ethernet is strongly recommended over Wi-Fi for reduced latency.
- **VAD Tuning:** The `RealtimeSTT` sensitivity can be tuned in `app.py` for better performance in noisy environments.
- **STT/TTS Service Selection:** You can choose which Speech-to-Text and Text-to-Speech services to use by setting environment variables in the `pipecatapp.nomad` job file.
## 10. Benchmarking
This project includes two types of benchmarks.
### 10.1. Real-Time Latency Benchmark
Measures the end-to-end latency of a live conversation. Enable it by setting `BENCHMARK_MODE = "true"` in the `env` section of the `pipecatapp.nomad` job file. Results are printed to the job logs.
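Per-stage latency bracketing of this kind can be approximated with a small context manager (an illustrative sketch, not the project's actual instrumentation; the stage names are assumptions):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed_stage(log: dict, name: str):
    # Bracket one pipeline stage (e.g. STT, LLM, TTS) and record its
    # wall-clock duration; summing stages approximates end-to-end latency.
    t0 = time.perf_counter()
    try:
        yield
    finally:
        log[name] = time.perf_counter() - t0

log = {}
with timed_stage(log, "llm"):
    time.sleep(0.01)  # stand-in for model inference
```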
### 10.2. Standardized Performance Benchmark
Uses `llama-bench` to measure the raw inference speed (tokens/sec) of the deployed LLM backend. Run the `benchmark.nomad` job to test the performance of the default model.
```bash
nomad job run /opt/nomad/jobs/benchmark.nomad
```
View results in the job logs: `nomad job logs llama-benchmark`
## 11. Advanced Development: Prompt Evolution
For advanced users, this project includes a workflow for automatically improving the agent's core prompt using evolutionary algorithms. See `prompt_engineering/PROMPT_ENGINEERING.md` for details.