# Distributed Conversational AI Pipeline for Legacy CPU Clusters

Last updated: 2026-02-01

The pipeline uses Ansible for automated provisioning, Nomad for cluster orchestration, and a state-of-the-art AI stack to create a responsive, streaming, embodied voice agent. For a detailed technical description of the system's layers, see the [Holistic Project Architecture](docs/ARCHITECTURE.md) document.

For development, testing, or bootstrapping the very first node of a new cluster, the bootstrap script accepts the following options:
- `--role <controller|worker>`: Selects the node's role.
  - `controller`: Sets up only the core infrastructure services (Consul, Nomad, etc.).
  - `worker`: Configures the node as a worker and requires `--controller-ip`.
- `--controller-ip <ip>`: The IP address of the main controller node. **Required** when `--role` is `worker`.
- `--user <user>`: Specify the target user for Ansible (default: `pipecatapp`).
- `--tags <tag1,tag2>`: Run only specific parts of the Ansible playbook (e.g., `--tags nomad` runs only the Nomad configuration tasks).
- `--external-model-server`: Skips the download and build steps for large language models. This is ideal for development or when using a remote model server.
- `--purge-jobs`: Stops and purges all running Nomad jobs before starting the bootstrap process, ensuring a clean deployment.
- `--leave-services-running`: Do not clean up Nomad and Consul data on startup (useful for restarts without state loss).
- `--clean`: **Use with caution.** Permanently deletes all untracked files in the repository (`git clean -fdx`), restoring it to a pristine state.
- `--debug`: Enables verbose Ansible logging (`-vvvv`) and saves the full output to `playbook_output.log`.
- `--verbose [level]`: Sets the verbosity level (0-4). Defaults to 0, or to 3 if the flag is given without a value.
- `--continue`: If a previous bootstrap run failed, resumes the process from the last successfully completed playbook, saving significant time.
- `--benchmark`: Runs benchmark tests during deployment.
- `--deploy-docker`: Deploys the pipecat application using Docker (the default).
- `--run-local`: Deploys the pipecat application using local `raw_exec` (useful for debugging without Docker rebuilds).
- `--container`: Runs the entire infrastructure inside a single large container (experimental).
- `--home-assistant-debug`: Enables debug mode for Home Assistant integrations.
- `--watch <target>`: Pauses for inspection after the specified target (task/role) completes.

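The flag semantics above can be sketched with `argparse`. This is an illustrative model of a subset of the options (the parser names and structure are assumptions for clarity, not the project's actual bootstrap code); note how `--verbose` defaults to 0 but becomes 3 when passed without a value, and how `--controller-ip` is enforced only for workers:

```python
import argparse

# Hypothetical sketch of the bootstrap flag semantics; only a subset of
# flags is modeled, and the real script's parser may differ.
parser = argparse.ArgumentParser(prog="bootstrap")
parser.add_argument("--role", choices=["controller", "worker"], required=True)
parser.add_argument("--controller-ip")
parser.add_argument("--user", default="pipecatapp")
# "--verbose [level]": 0 if absent, 3 if given with no value, else the value.
parser.add_argument("--verbose", nargs="?", type=int, const=3, default=0,
                    choices=range(5))

def parse(argv):
    args = parser.parse_args(argv)
    # --controller-ip is required only when joining as a worker.
    if args.role == "worker" and not args.controller_ip:
        parser.error("--controller-ip is required when --role is worker")
    return args

args = parse(["--role", "worker", "--controller-ip", "10.0.0.10"])
print(args.role, args.controller_ip, args.verbose)  # → worker 10.0.0.10 0
```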
This single node is now ready to be used as a standalone conversational AI agent. It can also serve as the primary "seed" node for a larger cluster. To expand your cluster, see the advanced guide below.

The following tools are available in the codebase (`pipecatapp/tools/`):
- **Archivist (`archivist`)**: Performs deep research on the agent's long-term memory.
- **Claude Clone (`claude_clone`)**: A tool for interacting with a Claude-like model.
- **Code Runner (`code_runner`)**: Executes Python code in a secure, sandboxed environment.
- **Container Registry (`container_registry`)**: Searches for container images and tags in the Docker Registry.
- **Council (`council`)**: Convenes a council of AI experts to deliberate on a query.
- **Dependency Scanner (`dependency_scanner`)**: Scans Python packages for vulnerabilities using the OSV database.
- **Desktop Control (`desktop_control`)**: Provides full control over the desktop environment (screenshots, mouse/keyboard).
- **Experiment (`experiment`)**: Orchestrates A/B testing or parallel experiments for code generation.
- **File Editor (`file_editor`)**: Reads, writes, and patches files in the codebase.
- **Final Answer (`final_answer`)**: Provides the final answer to the user.
- **Git (`git`)**: Interacts with Git repositories.
- **Home Assistant (`ha`)**: Controls smart home devices via Home Assistant.
- **LLxprt Code (`llxprt_code`)**: A specialized tool for code-related tasks.
- **Master Control Program (`mcp`)**: Provides agent introspection and self-control.
- **Open Workers (`open_workers`)**: Manages and interacts with open worker agents.
- **OpenClaw (`openclaw`)**: Sends messages via the OpenClaw Gateway to various channels (WhatsApp, Telegram, etc.).
- **OpenCode (`opencode`)**: Interface for the OpenCode tool.
- **Orchestrator (`orchestrator`)**: Dispatches high-priority, complex jobs to the cluster.
- **Planner (`planner`)**: Plans complex tasks and executes them.
- **Power (`power`)**: Controls the cluster's power management policies.
- **Project Mapper (`project_mapper`)**: Scans the codebase to generate a project structure map.
- **Prompt Improver (`prompt_improver`)**: A tool for improving prompts.
- **RAG (`rag`)**: Searches the project's documentation to answer questions.
- **Search (`search`)**: Searches the codebase for text patterns or file names.
- **Shell (`shell`)**: Executes shell commands (uses a persistent tmux session).
- **Smol Agent (`smol_agent_computer`)**: A tool for creating small, specialized agents.
- **Spec Loader (`spec_loader`)**: Clones external Git repositories (docs, specs) and ingests them into the agent's context.
- **SSH (`ssh`)**: Executes commands on remote machines.
- **Submit Solution (`submit_solution`)**: Allows a Worker Agent to submit a code solution or artifact to be parsed by the ExperimentTool/Judge.
- **Summarizer (`summarizer`)**: Summarizes conversation history.
- **Swarm (`swarm`)**: Spawns multiple worker agents to perform tasks in parallel.
- **Term Everything (`term_everything`)**: Provides a terminal interface for interacting with the system.
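Each tool is addressed by its short identifier (`shell`, `rag`, `swarm`, ...). A minimal sketch of how such name-keyed dispatch typically works is below; the registry, decorator, and the `summarizer` stand-in behavior here are hypothetical illustrations, not the actual `pipecatapp/tools/` implementation:

```python
# Hypothetical tool registry: maps a short identifier to a callable.
# (Illustrative only; the project's real registry may differ.)
TOOLS = {}

def register(name):
    """Decorator that records a function under its tool identifier."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@register("summarizer")
def summarizer(history):
    # Placeholder behavior standing in for a real summarization model.
    return " ".join(history)[:80]

def dispatch(name, *args):
    """Look up a tool by identifier and invoke it."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](*args)

print(dispatch("summarizer", ["hello", "world"]))  # → hello world
```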
The true power of this architecture is the ability to deploy multiple, specialized experts.

The `TwinService` in the `pipecatapp` will automatically discover any service registered in Consul with the `llama-api-` prefix and make it available for routing.

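Prefix-based discovery against Consul's catalog can be sketched as follows. The `/v1/catalog/services` endpoint is Consul's real HTTP API (it returns a JSON object mapping service names to tag lists); the filtering helper and sample payload are assumptions for illustration, not the `TwinService` code itself:

```python
LLAMA_PREFIX = "llama-api-"

def llama_services(catalog):
    """Filter a Consul /v1/catalog/services payload down to llama expert APIs."""
    return sorted(name for name in catalog if name.startswith(LLAMA_PREFIX))

# Against a live agent, the catalog would come from Consul's HTTP API, e.g.:
#   import json, urllib.request
#   catalog = json.load(urllib.request.urlopen(
#       "http://127.0.0.1:8500/v1/catalog/services"))
sample = {
    "consul": [],
    "llama-api-general": ["expert"],
    "llama-api-coder": ["expert"],
    "nomad": [],
}
print(llama_services(sample))  # → ['llama-api-coder', 'llama-api-general']
```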
### 8.3. Distributed Split Inference

To support running large models on legacy hardware with limited RAM, the system supports **Split Inference**.

- **How it works:** The `expert` job can be configured to offload computation to `rpc-server` providers running on worker nodes.
- **Configuration:** When deploying an expert, the system automatically discovers available `rpc-provider` services via Consul and passes them to the `llama-server` using the `--rpc` argument. This allows the model's layers to be split across multiple machines, aggregating their memory and compute power.

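Assembling that `--rpc` argument from discovered providers might look like the sketch below. The field names follow the shape of Consul's `/v1/catalog/service/<name>` response (`ServiceAddress`, `ServicePort`), and llama.cpp's `llama-server` accepts `--rpc` with a comma-separated list of `host:port` endpoints; the helper itself and the sample addresses are illustrative assumptions, not the project's deployment code:

```python
def rpc_flag(providers):
    """Build llama-server's --rpc argument from Consul service entries.

    Each provider dict mirrors an entry from /v1/catalog/service/<name>.
    Returns [] when no providers were discovered, so the expert runs locally.
    """
    endpoints = ",".join(
        f"{p['ServiceAddress']}:{p['ServicePort']}" for p in providers
    )
    return ["--rpc", endpoints] if endpoints else []

# Hypothetical rpc-provider entries discovered via Consul:
providers = [
    {"ServiceAddress": "10.0.0.11", "ServicePort": 50052},
    {"ServiceAddress": "10.0.0.12", "ServicePort": 50052},
]
cmd = ["llama-server", "-m", "model.gguf", *rpc_flag(providers)]
print(" ".join(cmd))
# → llama-server -m model.gguf --rpc 10.0.0.11:50052,10.0.0.12:50052
```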
## 9. Advanced System Features

### 9.1. Power Management
