# p4n4-ai

Dockerized GenAI stack — local LLM inference, stateful AI agents, and workflow automation.
The GenAI stack (Ollama · Letta · n8n) brings local AI capabilities to your IoT deployment. Ollama runs open-weight LLMs entirely on-device, Letta provides persistent AI agents with long-term memory, and n8n wires everything together with event-driven workflows.
The stack attaches to the shared `p4n4-net` Docker bridge network created by p4n4-iot, enabling seamless integration with MQTT, InfluxDB, Node-RED, and Grafana.
Part of the p4n4 platform — an EdgeAI + GenAI integration platform for IoT deployments.
## Table of Contents
- Architecture
- Stack Components
- Prerequisites
- Getting Started
- Project Structure
- Ollama Models
- n8n Workflows
- GPU Support
- Usage
- Default Ports
- Default Credentials
- Network Requirements
- Security Hardening
- Local Overrides
- Integration with p4n4-iot
- Resources
- License
## Architecture

```text
[p4n4-iot / MQTT / InfluxDB]
            │
            │  (shared p4n4-net bridge)
            ▼
          [n8n]   ← event-driven workflow automation
          /    \
         ▼      ▼
   [Ollama]  [Letta]   ← local LLM runtime + stateful AI agents
```
Data flow: n8n subscribes to MQTT topics (via p4n4-net) and triggers AI workflows. Ollama serves local LLM inference, Letta manages persistent AI agents with memory, and n8n routes results back to MQTT, InfluxDB, or external webhooks.
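As a concrete sketch of the enrichment hop, this is roughly how a workflow step would assemble the request body it POSTs to Ollama. The event payload, field names, and prompt wording are illustrative, not taken from the starter workflows:

```shell
# Illustrative MQTT event; the fields are made up for this sketch.
EVENT='{"device":"cam-01","label":"person","confidence":0.42}'

# Escape embedded quotes so the event can sit inside the JSON prompt string.
ESCAPED=$(printf '%s' "$EVENT" | sed 's/"/\\"/g')

# Request body in the shape Ollama's /api/generate endpoint expects.
BODY=$(printf '{"model":"llama3.2","prompt":"Explain this IoT event: %s","stream":false}' "$ESCAPED")
echo "$BODY"

# With the stack running, a workflow (or you) would send it like:
#   curl -s http://localhost:11434/api/generate -d "$BODY"
```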
## Stack Components

| Service | Role | Description |
|---|---|---|
| Ollama | Local LLM Runtime | Runs open-weight models (Llama, Mistral, Phi, etc.) entirely on-device with zero data egress. Exposes an OpenAI-compatible REST API on port 11434. |
| Letta | AI Agent Framework | Stateful AI agent framework with persistent memory (formerly MemGPT). Build agents that remember context across sessions and reason over long-term IoT event histories. |
| n8n | Workflow Automation | Low-code, node-based workflow engine. Connects MQTT, InfluxDB, Ollama, Letta, and external APIs without custom glue code. Includes four starter IoT + AI workflows. |
## Prerequisites

- Docker (v20.10+)
- Docker Compose (v2.0+)
- At least 8 GB RAM available to Docker (16 GB recommended for larger models)
- p4n4-iot running (or the `p4n4-net` network created manually — see Network Requirements)
- (Optional) NVIDIA GPU with drivers + NVIDIA Container Toolkit for GPU acceleration
## Getting Started

1. **Clone the repository**

   ```shell
   git clone https://github.com/raisga/p4n4-ai.git
   cd p4n4-ai
   ```

2. **Configure environment variables**

   ```shell
   cp .env.example .env
   # Edit .env — at minimum change N8N_ENCRYPTION_KEY and passwords
   ```

3. **Ensure `p4n4-net` exists** (skip if p4n4-iot is already running)

   ```shell
   docker network create p4n4-net
   ```

4. **Start the stack**

   ```shell
   docker compose up -d   # or: make up
   ```

5. **Pull a language model**

   ```shell
   make pull-models
   # or pull specific models:
   ./scripts/pull-models.sh llama3.2 nomic-embed-text
   ```

6. **Open the interfaces**

   - n8n: http://localhost:5678
   - Letta: http://localhost:8283
   - Ollama API: http://localhost:11434
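Once the stack is up, a quick smoke test confirms each service answers HTTP at all (a sketch using plain curl; adjust hosts and ports if you changed the defaults):

```shell
# Probe each endpoint; prints "<name> OK" if it answers HTTP
# (even with a 401 from basic auth), "<name> not reachable" otherwise.
check() {
  curl -s --max-time 3 -o /dev/null "$1" && echo "$2 OK" || echo "$2 not reachable"
}
check http://localhost:11434/api/tags Ollama
check http://localhost:5678 n8n
check http://localhost:8283 Letta
```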
## Project Structure

```text
p4n4-ai/
├── docker-compose.yml                    # GenAI stack service definitions
├── docker-compose.override.yml.example   # Local override template (GPU, dev)
├── Makefile                              # Convenience commands
├── .env.example                          # Environment template (copy to .env)
├── .gitignore
├── config/
│   ├── ollama/                           # Ollama config (models pulled at runtime)
│   ├── letta/
│   │   └── letta.conf                    # Letta server configuration reference
│   └── n8n/
│       └── workflows/
│           ├── alert-enrichment.json     # Enrich MQTT alerts with LLM analysis
│           ├── scheduled-digest.json     # Hourly telemetry summary via Ollama
│           ├── device-onboarding.json    # Auto-register new MQTT devices
│           └── incident-escalation.json  # Classify and escalate critical alerts
└── scripts/
    ├── pull-models.sh                    # Helper to pull models into Ollama
    ├── selector.sh                       # Interactive service selector
    └── check_env_example.py              # CI: .env.example completeness check
```
## Ollama Models

Models are not bundled in the image — pull them after starting the stack.

```shell
# Pull the default model (llama3.2)
make pull-models

# Pull specific models
./scripts/pull-models.sh llama3.2
./scripts/pull-models.sh llama3.2 nomic-embed-text phi3.5

# Pull directly via Docker
docker exec p4n4-ollama ollama pull llama3.2
```

| Model | Size | Use Case |
|---|---|---|
| `llama3.2` | 2 GB | General inference, alert analysis, summaries |
| `phi3.5` | 2.2 GB | Lightweight reasoning, classification |
| `nomic-embed-text` | 274 MB | Embeddings for Letta agent memory |
| `llama3.2:70b` | 40 GB | High-quality reasoning (requires GPU) |

List installed models:

```shell
make models
# or
docker exec p4n4-ollama ollama list
```

## n8n Workflows

Four starter workflows are included in `config/n8n/workflows/`. Import them via the n8n UI:
- Open n8n at http://localhost:5678
- Go to Workflows → Import from File
- Select the JSON file from `config/n8n/workflows/`
| Workflow | Description |
|---|---|
| `alert-enrichment.json` | Subscribes to the `inference/results` MQTT topic; sends low-confidence results to Ollama for analysis |
| `scheduled-digest.json` | Runs hourly; queries InfluxDB for recent telemetry and generates a natural-language summary via Ollama |
| `device-onboarding.json` | Listens on `devices/+/register`; auto-registers new devices and publishes a confirmation to MQTT |
| `incident-escalation.json` | Listens on `alerts/+/critical`; classifies severity via Ollama and publishes the enriched alert to `alerts/escalated` |
After importing workflows, configure an MQTT credential named **p4n4 MQTT**:

- Host: `p4n4-mqtt` (service name on `p4n4-net`)
- Port: `1883`
- Username/Password: match the values in your p4n4-iot `.env`
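To exercise the device-onboarding workflow end to end, you can publish a registration message yourself. The device id and payload below are made up for illustration, and the dockerized `mosquitto_pub` invocation assumes the broker is up, so it is left commented:

```shell
# Illustrative registration event for the devices/+/register topic.
DEVICE="sensor-42"
TOPIC="devices/${DEVICE}/register"
PAYLOAD='{"type":"temperature","fw":"1.0.3"}'
echo "topic:   $TOPIC"
echo "payload: $PAYLOAD"

# With the stack running, publish it from a throwaway container on p4n4-net:
#   docker run --rm --network p4n4-net eclipse-mosquitto \
#     mosquitto_pub -h p4n4-mqtt -t "$TOPIC" -m "$PAYLOAD"
```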
## GPU Support

To enable NVIDIA GPU acceleration for Ollama, use the override file:

```shell
cp docker-compose.override.yml.example docker-compose.override.yml
# Uncomment the 'ollama' GPU section
docker compose up -d
```

Verify GPU detection:

```shell
docker exec p4n4-ollama nvidia-smi
```

For AMD (ROCm), uncomment the `ollama:rocm` override section instead.
## Usage

```shell
make help                 # Show all available commands
make up                   # Start the full stack
make down                 # Stop all services
make restart              # Restart all services
make logs                 # Follow logs from all services
make ps                   # Show service status
make status               # Colorized status table
make start SERVICE=n8n    # Start a single service
make stop SERVICE=letta   # Stop a single service
make models               # List Ollama models
make pull-models          # Pull default models
make test-ollama          # Send a test prompt to Ollama
make clean                # Stop services and remove all data volumes
```

Send a test prompt to Ollama:

```shell
# Via make
make test-ollama

# Via curl (from host)
curl http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2","prompt":"Hello!","stream":false}'

# From another container on p4n4-net
docker run --rm --network p4n4-net curlimages/curl \
  curl -s http://p4n4-ollama:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2","prompt":"Hello!","stream":false}'
```

## Default Ports

| Service | Port | URL |
|---|---|---|
| Ollama API | `11434` | http://localhost:11434 |
| Letta Server | `8283` | http://localhost:8283 |
| n8n UI | `5678` | http://localhost:5678 |
## Default Credentials

All credentials are set in `.env`. Defaults from `.env.example`:

| Service | Username | Password |
|---|---|---|
| n8n | `admin` | `adminpassword` |
| Letta | (no username) | `lettapassword` |

**Note:** Change all passwords and the `N8N_ENCRYPTION_KEY` before deploying to production.
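One low-friction way to generate replacements, assuming `openssl` is available. `LETTA_SERVER_PASSWORD` is the variable named in Security Hardening below; the n8n password variable name here is a placeholder — use whatever name your `.env` actually defines:

```shell
# 32 random bytes → 64 hex chars, comfortably above n8n's 32-char minimum.
N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)

# Strong passwords for the other services (N8N_PASSWORD is a placeholder name).
N8N_PASSWORD=$(openssl rand -base64 24)
LETTA_SERVER_PASSWORD=$(openssl rand -base64 24)

echo "N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}"
```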
## Network Requirements

This stack attaches to `p4n4-net` as an external network. The network must exist before running `docker compose up`.

**Option 1 — Use p4n4-iot (recommended):**

```shell
# In p4n4-iot directory
docker compose up -d
# Then start p4n4-ai
```

**Option 2 — Create the network manually:**

```shell
docker network create p4n4-net
docker compose up -d
```

**Option 3 — Use the CLI:**

```shell
p4n4 up --all   # starts all stacks in the correct order
```
## Security Hardening

- **Change all default credentials** in `.env` before exposing services externally.
- **Set a strong `N8N_ENCRYPTION_KEY`** — this encrypts stored credentials in n8n. Minimum 32 characters.
- **Letta API password** — set `LETTA_SERVER_PASSWORD` to a strong value. All API calls require this password as a bearer token.
- **Restrict port exposure** — for production, remove host-port bindings from `docker-compose.yml` and access services only via `p4n4-net` or a reverse proxy.
- **Ollama access** — Ollama has no built-in authentication. Restrict access at the network or reverse-proxy level for production deployments.
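For the port-exposure point, a `docker-compose.override.yml` fragment can drop Ollama's host binding entirely. This is a sketch: it assumes the service is named `ollama` in `docker-compose.yml`, and the `!reset` tag requires Docker Compose ≥ 2.24:

```yaml
# docker-compose.override.yml — keep Ollama reachable only from p4n4-net
services:
  ollama:
    ports: !reset []   # removes the host-port bindings from the base file
```

Containers on `p4n4-net` still reach the service at `p4n4-ollama:11434`.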
## Local Overrides

Use `docker-compose.override.yml` for machine-specific settings (GPU, external hostnames, custom volumes):

```shell
cp docker-compose.override.yml.example docker-compose.override.yml
# Edit docker-compose.override.yml as needed
docker compose up -d
```

The override file is listed in `.gitignore` and will never be committed.
## Integration with p4n4-iot

When running alongside p4n4-iot on the same `p4n4-net` network, services can be referenced by their container names:

| p4n4-iot Service | Address from p4n4-ai |
|---|---|
| MQTT Broker | `p4n4-mqtt:1883` |
| InfluxDB | `p4n4-influxdb:8086` |
| Node-RED | `p4n4-node-red:1880` |

Use these addresses in n8n workflow nodes, Letta agent configurations, and Ollama-powered scripts.

**Shared secrets** (must match between stacks — set identical values in both `.env` files):

| Variable | Purpose |
|---|---|
| `INFLUXDB_TOKEN` | InfluxDB API token |
| `INFLUXDB_ORG` | InfluxDB organization |
| `INFLUXDB_BUCKET` | Primary InfluxDB bucket |
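A small sanity check can catch mismatched shared secrets before anything fails at runtime. This is a sketch; the `../p4n4-iot/.env` path is an assumption about where you cloned the sibling repo:

```shell
# Compare the shared variables across the two stacks' .env files.
IOT_ENV="../p4n4-iot/.env"   # adjust to your checkout location
AI_ENV=".env"

for var in INFLUXDB_TOKEN INFLUXDB_ORG INFLUXDB_BUCKET; do
  a=$(grep "^${var}=" "$IOT_ENV" 2>/dev/null | cut -d= -f2-)
  b=$(grep "^${var}=" "$AI_ENV"  2>/dev/null | cut -d= -f2-)
  if [ "$a" = "$b" ]; then echo "OK    $var"; else echo "DIFF  $var"; fi
done
```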
## Resources

- p4n4 Platform — umbrella repo and architecture docs
- p4n4-iot — IoT stack (MING)
- Ollama Documentation — available models and API reference
- Letta Documentation — agent framework docs
- n8n Documentation — workflow automation docs
## License

This project is licensed under the MIT License.