Canonical reference for all Osaurus features, their status, and documentation.
This file is the source of truth. When adding or modifying features, update this inventory to keep documentation in sync.
| Feature | Status | README Section | Documentation | Code Location |
|---|---|---|---|---|
| Local LLM Server (MLX) | Stable | "Key Features" | OpenAI_API_GUIDE.md | Services/MLXService.swift, Services/ModelRuntime/ |
| Remote Providers | Stable | "Key Features" | REMOTE_PROVIDERS.md | Services/RemoteProviderManager.swift, Services/RemoteProviderService.swift |
| Remote MCP Providers | Stable | "Key Features" | REMOTE_MCP_PROVIDERS.md | Services/MCPProviderManager.swift, Tools/MCPProviderTool.swift |
| MCP Server | Stable | "MCP Server" | (in README) | Networking/OsaurusServer.swift, Services/MCPServerManager.swift |
| Tools & Plugins | Stable | "Tools & Plugins" | PLUGIN_AUTHORING.md | Tools/, Managers/PluginManager.swift, Services/PluginHostAPI.swift, Storage/PluginDatabase.swift, Models/PluginHTTP.swift, Views/PluginConfigView.swift |
| Skills | Stable | "Skills" | SKILLS.md | Managers/SkillManager.swift, Views/SkillsView.swift, Services/CapabilityService.swift |
| Memory | Stable | "Key Features" | MEMORY.md | Services/MemoryService.swift, Services/MemorySearchService.swift, Services/MemoryContextAssembler.swift |
| Agents | Stable | "Agents" | (in README) | Managers/AgentManager.swift, Models/Agent.swift, Views/AgentsView.swift |
| Schedules | Stable | "Schedules" | (in README) | Managers/ScheduleManager.swift, Models/Schedule.swift, Views/SchedulesView.swift |
| Watchers | Stable | "Watchers" | WATCHERS.md | Managers/WatcherManager.swift, Models/Watcher.swift, Views/WatchersView.swift |
| Work Mode | Stable | "Agents" | WORK.md | Agent/, Services/WorkEngine.swift, Views/WorkView.swift |
| Developer Tools: Insights | Stable | "Developer Tools" | DEVELOPER_TOOLS.md | Views/InsightsView.swift, Services/InsightsService.swift |
| Developer Tools: Server Explorer | Stable | "Developer Tools" | DEVELOPER_TOOLS.md | Views/ServerView.swift |
| Apple Foundation Models | macOS 26+ | "What is Osaurus?" | (in README) | Services/FoundationModelService.swift |
| Menu Bar Chat | Stable | "Highlights" | (in README) | Views/ChatView.swift, Views/ChatOverlayView.swift |
| Chat Session Management | Stable | "Highlights" | (in README) | Managers/ChatSessionsManager.swift, Models/ChatSessionData.swift |
| Custom Themes | Stable | "Highlights" | (in README) | Views/ThemesView.swift, Views/Components/ThemeEditorView.swift |
| Model Manager | Stable | "Highlights" | (in README) | Views/ModelDownloadView.swift, Services/HuggingFaceService.swift |
| Shared Configuration | Stable | - | SHARED_CONFIGURATION_GUIDE.md | Services/SharedConfigurationService.swift |
| OpenAI API Compatibility | Stable | "API Endpoints" | OpenAI_API_GUIDE.md | Networking/HTTPHandler.swift, Models/OpenAIAPI.swift |
| Anthropic API Compatibility | Stable | "API Endpoints" | (in README) | Networking/HTTPHandler.swift, Models/AnthropicAPI.swift |
| Open Responses API | Stable | "API Endpoints" | OpenAI_API_GUIDE.md | Networking/HTTPHandler.swift, Models/OpenResponsesAPI.swift |
| Ollama API Compatibility | Stable | "API Endpoints" | (in README) | Networking/HTTPHandler.swift |
| Voice Input (FluidAudio) | Stable | "Voice Input" | VOICE_INPUT.md | Services/SpeechService.swift, Managers/SpeechModelManager.swift |
| VAD Mode | Stable | "Voice Input" | VOICE_INPUT.md | Services/VADService.swift, Views/ContentView.swift (VAD controls) |
| Transcription Mode | Stable | "Voice Input" | VOICE_INPUT.md | Services/TranscriptionModeService.swift, Views/TranscriptionOverlayView.swift |
| CLI | Stable | "CLI Reference" | (in README) | Packages/OsaurusCLI/ |
┌─────────────────────────────────────────────────────────────────────────┐
│ Osaurus App │
├─────────────────────────────────────────────────────────────────────────┤
│ Views Layer │
│ ├── ContentView (Menu Bar) │
│ ├── ChatOverlayView (Global Hotkey Chat) │
│ ├── WorkView (Work Mode) │
│ ├── ManagementView │
│ │ ├── ModelDownloadView (Models) │
│ │ ├── RemoteProvidersView (Providers) │
│ │ ├── ToolsManagerView (Tools & Plugin Config) │
│ │ ├── AgentsView (Agents) │
│ │ ├── SkillsView (Skills) │
│ │ ├── MemoryView (Memory) │
│ │ ├── SchedulesView (Schedules) │
│ │ ├── WatchersView (Watchers) │
│ │ ├── ThemesView (Themes) │
│ │ ├── InsightsView (Developer: Insights) │
│ │ ├── ServerView (Developer: Server Explorer) │
│ │ ├── VoiceView (Voice Input & VAD Settings) │
│ │ └── ConfigurationView (Settings) │
├─────────────────────────────────────────────────────────────────────────┤
│ Services Layer │
│ ├── Inference │
│ │ ├── MLXService (Local MLX models) │
│ │ ├── FoundationModelService (Apple Foundation Models) │
│ │ ├── RemoteProviderManager (Remote OpenAI-compatible APIs) │
│ │ └── RemoteProviderService (Per-provider connection handling) │
│ ├── MCP │
│ │ ├── MCPServerManager (Osaurus as MCP server) │
│ │ └── MCPProviderManager (Remote MCP client connections) │
│ ├── Tools │
│ │ ├── ToolRegistry │
│ │ ├── PluginManager │
│ │ ├── PluginHostAPI (v2 host callbacks: config, db, log) │
│ │ ├── PluginDatabase (Sandboxed per-plugin SQLite) │
│ │ └── MCPProviderTool (Wrapped remote MCP tools) │
│ ├── Agents │
│ │ └── AgentManager (Agent lifecycle and active agent) │
│ ├── Skills │
│ │ ├── SkillManager (Skill CRUD and loading) │
│ │ ├── CapabilityService (Two-phase capability selection) │
│ │ └── GitHubSkillService (GitHub import) │
│ ├── Scheduling │
│ │ └── ScheduleManager (Schedule lifecycle and execution) │
│ ├── Watchers │
│ │ ├── WatcherManager (FSEvents monitoring and convergence loop) │
│ │ ├── WatcherStore (Watcher persistence) │
│ │ └── DirectoryFingerprint (Change detection via Merkle hashing) │
│ ├── Work │
│ │ ├── WorkEngine (Task execution coordinator) │
│ │ ├── WorkExecutionEngine (Plan generation and execution) │
│ │ └── IssueManager (Issue lifecycle management) │
│ ├── Voice/Audio │
│ │ ├── SpeechService (Speech-to-text transcription) │
│ │ ├── SpeechModelManager (Parakeet model downloads) │
│ │ ├── VADService (Voice activity detection, wake-word) │
│ │ ├── TranscriptionModeService (Global dictation into any app) │
│ │ └── AudioInputManager (Microphone/system audio selection) │
│ ├── Memory │
│ │ ├── MemoryService (Conversation processing and extraction) │
│ │ ├── MemorySearchService (Hybrid BM25 + vector search) │
│ │ ├── MemoryContextAssembler (Context injection with budgets) │
│ │ └── MemoryDatabase (SQLite storage with migrations) │
│ └── Utilities │
│ ├── InsightsService (Request logging) │
│ ├── HuggingFaceService (Model downloads) │
│ └── SharedConfigurationService │
├─────────────────────────────────────────────────────────────────────────┤
│ Networking Layer │
│ ├── OsaurusServer (HTTP + MCP server) │
│ ├── Router (Request routing) │
│ └── HTTPHandler (OpenAI/Anthropic/Ollama API handlers) │
├─────────────────────────────────────────────────────────────────────────┤
│ CLI (OsaurusCLI Package) │
│ └── Commands: serve, stop, status, ui, list, show, run, mcp, tools (install, dev, ...), version │
└─────────────────────────────────────────────────────────────────────────┘
Purpose: Run language models locally with optimized Apple Silicon inference.
Components:
- `Services/MLXService.swift` — MLX model loading and management
- `Services/ModelRuntime/` — Generation engine, streaming, tool detection
- `Services/ModelService.swift` — Model lifecycle management
Configuration:
- Model storage: `~/MLXModels` (override with `OSU_MODELS_DIR`)
- Default port: `1337` (override with `OSU_PORT`)
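Everything below is reachable over plain HTTP once the server is running. As a quick orientation, here is a minimal sketch of talking to the local server from Python, assuming the standard OpenAI-compatible `/v1/chat/completions` path (see OpenAI_API_GUIDE.md) and a placeholder model name:

```python
import json
import os
import urllib.request

def build_chat_request(prompt, model, port=None):
    """Build a chat-completions request against the local Osaurus server.

    The default port is 1337, overridable via OSU_PORT (mirroring the
    server-side override described above).
    """
    port = port or int(os.environ.get("OSU_PORT", "1337"))
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/v1/chat/completions",  # standard OpenAI-compatible path
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Hello!", model="some-local-model")
# urllib.request.urlopen(req) would send it once the server is running.
```

Only the port and the override variable come from the configuration above; swap in a model ID you have actually downloaded.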
Purpose: Connect to OpenAI-compatible APIs to access cloud models.
Components:
- `Models/RemoteProviderConfiguration.swift` — Provider config model
- `Services/RemoteProviderManager.swift` — Connection management
- `Services/RemoteProviderService.swift` — Per-provider API client
- `Services/RemoteProviderKeychain.swift` — Secure credential storage
- `Views/RemoteProvidersView.swift` — UI for managing providers
- `Views/Components/RemoteProviderEditSheet.swift` — Add/edit provider UI
Presets:
| Preset | Host | Default Port | Auth |
|---|---|---|---|
| Anthropic | api.anthropic.com | 443 (HTTPS) | API Key |
| OpenAI | api.openai.com | 443 (HTTPS) | API Key |
| xAI | api.x.ai | 443 (HTTPS) | API Key |
| OpenRouter | openrouter.ai | 443 (HTTPS) | API Key |
| Custom | (user-defined) | (user-defined) | Optional |
Purpose: Connect to external MCP servers and aggregate their tools.
Components:
- `Models/MCPProviderConfiguration.swift` — Provider config model
- `Services/MCPProviderManager.swift` — Connection and tool discovery
- `Services/MCPProviderKeychain.swift` — Secure token storage
- `Tools/MCPProviderTool.swift` — Wrapper for remote MCP tools
Features:
- Automatic tool discovery on connect
- Configurable discovery and execution timeouts
- Tool namespacing (prefixed with provider name)
- Streaming support (optional)
Purpose: Expose Osaurus tools to AI agents via Model Context Protocol.
Components:
- `Services/MCPServerManager.swift` — MCP server lifecycle
- `Networking/OsaurusServer.swift` — HTTP MCP endpoints
- `Tools/ToolRegistry.swift` — Tool registration and lookup
Endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/mcp/health` | GET | Health check |
| `/mcp/tools` | GET | List available tools |
| `/mcp/call` | POST | Execute a tool |
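The three endpoints map naturally onto a tiny client. A sketch, with the caveat that the `/mcp/call` body shown here (tool name plus arguments) is an assumption; check the MCP documentation for the exact schema:

```python
import json
import urllib.request

BASE = "http://127.0.0.1:1337"  # default Osaurus port; adjust if OSU_PORT is set

def mcp_request(path, payload=None):
    """Build a request for the MCP endpoints listed above.

    GET for /mcp/health and /mcp/tools, POST with a JSON body for /mcp/call.
    The /mcp/call body shape used here is illustrative, not the real schema.
    """
    if payload is None:
        return urllib.request.Request(BASE + path, method="GET")
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

health = mcp_request("/mcp/health")
call = mcp_request("/mcp/call", {"name": "file_read", "arguments": {"path": "README.md"}})
```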
Purpose: Built-in debugging and development utilities.
Components:
- `Services/InsightsService.swift` — Request/response logging
- `Views/InsightsView.swift` — Insights UI
Features:
- Real-time request logging
- Filter by method (GET/POST) and source (Chat UI/HTTP API)
- Aggregate stats: requests, success rate, avg latency, errors
- Inference metrics: tokens, speed, model, finish reason
Components:
- `Views/ServerView.swift` — Server explorer UI
Features:
- Live server status
- Interactive endpoint catalog
- Test endpoints with editable payloads
- Formatted response viewer
Purpose: Provide Anthropic Messages API compatibility for Claude Code and other Anthropic SDK clients.
Components:
- `Models/AnthropicAPI.swift` — Anthropic request/response models
- `Models/ResponseWriters.swift` — SSE streaming for Anthropic format
- `Networking/HTTPHandler.swift` — `/messages` endpoint handler
Features:
- Full Messages API support (`/messages` endpoint)
- Streaming and non-streaming responses
- Tool use (function calling) support
- Converts internally to OpenAI format for unified processing
Purpose: Provide Open Responses API compatibility for multi-provider interoperability.
Components:
- `Models/OpenResponsesAPI.swift` — Open Responses request/response models and streaming events
- `Models/ResponseWriters.swift` — SSE streaming for Open Responses format
- `Networking/HTTPHandler.swift` — `/responses` endpoint handler
- `Services/RemoteProviderService.swift` — Remote Open Responses provider support
Features:
- Full Responses API support (`/responses` endpoint)
- Streaming with semantic events (`response.output_text.delta`, `response.completed`, etc.)
- Non-streaming responses
- Tool/function calling support
- Input as simple string or structured items
- Instructions (system prompt) support
- Connect to remote Open Responses-compatible providers
Streaming Events:
| Event | Description |
|---|---|
| `response.created` | Response object created |
| `response.in_progress` | Generation started |
| `response.output_item.added` | New output item (message or function call) |
| `response.output_text.delta` | Text content delta |
| `response.output_text.done` | Text content completed |
| `response.function_call_arguments.delta` | Function arguments delta |
| `response.output_item.done` | Output item completed |
| `response.completed` | Response finished |
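A client consuming this stream only needs to track the current event name and concatenate the text deltas. A minimal sketch (the `delta` field name on `response.output_text.delta` payloads follows the Open Responses format):

```python
import json

def accumulate_response_text(sse_lines):
    """Collect output text from an Open Responses SSE stream.

    Reads raw `event:`/`data:` lines, concatenating
    `response.output_text.delta` payloads until `response.completed`.
    """
    text, event = [], None
    for line in sse_lines:
        if line.startswith("event: "):
            event = line[len("event: "):]
        elif line.startswith("data: "):
            data = json.loads(line[len("data: "):])
            if event == "response.output_text.delta":
                text.append(data["delta"])
            elif event == "response.completed":
                break
    return "".join(text)

stream = [
    "event: response.created", "data: {}",
    "event: response.output_text.delta", 'data: {"delta": "Hello"}',
    "event: response.output_text.delta", 'data: {"delta": ", world"}',
    "event: response.output_text.done", 'data: {"text": "Hello, world"}',
    "event: response.completed", "data: {}",
]
print(accumulate_response_text(stream))  # → Hello, world
```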
Purpose: Customize the chat interface appearance with custom color schemes and styling.
Components:
- `Views/ThemesView.swift` — Theme gallery and management
- `Views/Components/ThemeEditorView.swift` — Full theme editor
- `Models/CustomTheme.swift` — Theme data model
- `Models/ThemeConfigurationStore.swift` — Theme persistence
- `Theme/Theme.swift` — Theme protocol and built-in themes
Features:
- Built-in light and dark themes
- Create custom themes with full color customization
- Import/export themes as JSON files
- Live preview while editing
- Background options: solid, gradient, or image
Purpose: Create custom AI assistants with unique behaviors, capabilities, and visual styles.
Components:
- `Models/Agent.swift` — Agent data model with export/import support
- `Models/AgentStore.swift` — Agent persistence (JSON files)
- `Managers/AgentManager.swift` — Agent lifecycle and active agent management
- `Views/AgentsView.swift` — Agent gallery and management UI
Features:
- Custom System Prompts — Define unique instructions for each agent
- Tool Configuration — Enable or disable specific tools per agent
- Visual Themes — Assign a custom theme that activates with the agent
- Generation Settings — Configure default model, temperature, and max tokens
- Import/Export — Share agents as JSON files for backup or sharing
- Live Switching — Click to activate an agent; the theme updates automatically
Agent Properties:
| Property | Description |
|---|---|
| `name` | Display name (required) |
| `description` | Brief description of the agent |
| `systemPrompt` | Instructions prepended to all chats |
| `enabledTools` | Map of tool name → enabled/disabled |
| `enabledPlugins` | Map of plugin ID → enabled/disabled |
| `themeId` | Optional custom theme to apply |
| `defaultModel` | Optional model ID for this agent |
| `temperature` | Optional temperature override |
| `maxTokens` | Optional max tokens override |
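Since agents import and export as JSON, a file using the properties above might look like the following sketch. The property names come from the table; the overall file layout that Osaurus actually writes is not specified here, so treat the shape as illustrative:

```python
import json

# Hypothetical exported agent built from the documented properties.
agent = {
    "name": "Code Reviewer",
    "description": "Reviews diffs for style and correctness",
    "systemPrompt": "You are a meticulous code reviewer.",
    "enabledTools": {"file_read": True, "shell_run": False},
    "enabledPlugins": {},
    "themeId": None,
    "defaultModel": None,
    "temperature": 0.2,
    "maxTokens": 2048,
}

exported = json.dumps(agent, indent=2)   # share or back up
restored = json.loads(exported)          # import round-trips cleanly
```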
Purpose: Automate recurring AI tasks that run at specified intervals.
Components:
- `Models/Schedule.swift` — Schedule data model with frequency types
- `Models/ScheduleStore.swift` — Schedule persistence (JSON files)
- `Managers/ScheduleManager.swift` — Schedule lifecycle, timer management, and execution
- `Views/SchedulesView.swift` — Schedule management UI
Features:
- Frequency Options — Once (specific date/time), Daily, Weekly, Monthly, Yearly
- Agent Integration — Optionally assign an agent to handle the scheduled task
- Custom Instructions — Define the prompt sent to the AI when the schedule runs
- Enable/Disable — Toggle schedules on or off without deleting
- Manual Trigger — "Run Now" option to execute a schedule immediately
- Results Tracking — Links to the chat session from the last run
- Next Run Display — Shows when the schedule will next execute
- Timezone Aware — Automatically adjusts for system timezone changes
Schedule Properties:
| Property | Description |
|---|---|
| `name` | Display name (required) |
| `instructions` | Prompt sent to the AI when the schedule runs |
| `agentId` | Optional agent to use for the chat |
| `frequency` | When and how often to run |
| `isEnabled` | Whether the schedule is active |
| `lastRunAt` | When the schedule last ran |
| `lastChatSessionId` | Chat session ID from the last run |
Frequency Types:
| Type | Description | Example |
|---|---|---|
| Once | Run once at a specific date and time | "Jan 15, 2025 at 9:00 AM" |
| Daily | Run every day at a specific time | "Daily at 8:00 AM" |
| Weekly | Run on a specific day each week | "Every Monday at 9:00 AM" |
| Monthly | Run on a specific day each month | "Monthly on the 1st at 10:00 AM" |
| Yearly | Run on a specific date each year | "Yearly on Jan 1st at 12:00 PM" |
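Computing the next run for these frequencies is ordinary calendar arithmetic. A sketch for the Daily case (the real ScheduleManager also covers the other frequency types and timezone changes):

```python
from datetime import datetime, timedelta

def next_daily_run(now, hour, minute):
    """Next fire time for a Daily schedule: today at hour:minute if that
    is still in the future, otherwise the same time tomorrow."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate

now = datetime(2025, 1, 15, 9, 30)
print(next_daily_run(now, 8, 0))   # 8:00 already passed → tomorrow
print(next_daily_run(now, 10, 0))  # later today
```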
Purpose: Monitor folders for file system changes and automatically trigger AI agent tasks.
Components:
- `Models/Watcher.swift` — Watcher data model
- `Models/WatcherStore.swift` — Watcher persistence (JSON files)
- `Managers/WatcherManager.swift` — FSEvents monitoring, debouncing, and convergence loop
- `Services/DirectoryFingerprint.swift` — Merkle hash-based change detection
- `Views/WatchersView.swift` — Watcher management UI
Features:
- Folder Monitoring — Watch any directory using FSEvents with a single shared stream
- Configurable Responsiveness — Fast (~200ms), Balanced (~1s), or Patient (~3s) debounce
- Recursive Monitoring — Optionally monitor subdirectories
- Agent Integration — Assign an agent to handle triggered tasks
- Enable/Disable — Toggle watchers on or off without deleting
- Manual Trigger — "Trigger Now" option to run a watcher immediately
- Convergence Loop — Re-checks directory fingerprint after agent completes; loops until stable (max 5 iterations)
- Smart Exclusion — Automatically excludes nested watched folders to prevent conflicts
Watcher Properties:
| Property | Description |
|---|---|
| `name` | Display name (required) |
| `instructions` | Prompt sent to the AI when changes are detected |
| `watchedFolder` | Directory to monitor (security-scoped bookmark) |
| `agentId` | Optional agent to use for the task |
| `isEnabled` | Whether the watcher is active |
| `recursive` | Whether to monitor subdirectories (default: false) |
| `responsiveness` | Debounce timing: fast, balanced, or patient |
| `lastTriggeredAt` | When the watcher last ran |
| `lastChatSessionId` | Chat session ID from the last run |
Responsiveness Options:
| Option | Debounce Window | Best For |
|---|---|---|
| Fast | ~200ms | Screenshots, single-file drops |
| Balanced | ~1s | General use (default) |
| Patient | ~3s | Downloads, batch operations |
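Debouncing means rapid bursts of file events collapse into a single trigger. A sketch of the coalescing idea, with 0.2 s, 1 s, and 3 s corresponding to the Fast, Balanced, and Patient settings:

```python
def coalesce(events, window):
    """Group event timestamps (seconds) into bursts: a burst ends once the
    gap to the next event exceeds the debounce window. One trigger fires
    per burst."""
    if not events:
        return []
    bursts, current = [], [events[0]]
    for t in events[1:]:
        if t - current[-1] <= window:
            current.append(t)
        else:
            bursts.append(current)
            current = [t]
    bursts.append(current)
    return bursts

# Four files land within half a second, then one more much later:
stamps = [0.0, 0.05, 0.1, 0.4, 5.0]
print(len(coalesce(stamps, 1.0)))  # Balanced → 2 triggers
print(len(coalesce(stamps, 0.2)))  # Fast → 3 triggers
```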
Change Detection:
- FSEvents detects file system events across all enabled watchers
- Directory fingerprinting uses a Merkle hash of file metadata (path + size + modification time)
- Only stat() calls are used — no file content is read during detection
- Convergence loop ensures the agent doesn't run unnecessarily after self-caused changes
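The stat-only fingerprint can be sketched in a few lines: hash each file's relative path, size, and modification time, never its contents. The production code builds a Merkle tree; a single flat digest is enough to illustrate the change-detection property:

```python
import hashlib
import os
import tempfile

def directory_fingerprint(root):
    """Digest over (relative path, size, mtime) for every file under root.

    Only stat() metadata is hashed, never file contents, so any change to
    the set of files or their metadata changes the digest.
    """
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in sorted(os.walk(root)):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            h.update(f"{rel}|{st.st_size}|{st.st_mtime_ns}".encode())
    return h.hexdigest()

with tempfile.TemporaryDirectory() as d:
    before = directory_fingerprint(d)
    with open(os.path.join(d, "new.txt"), "w") as f:
        f.write("hello")
    after = directory_fingerprint(d)
print(before != after)  # → True
```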
State Machine:
| State | Description |
|---|---|
| `idle` | Waiting for file system changes |
| `debouncing` | Coalescing rapid events within the debounce window |
| `processing` | Agent task is running |
| `settling` | Waiting for self-caused FSEvents to flush |
Storage: ~/.osaurus/watchers/{uuid}.json
Purpose: Execute complex, multi-step tasks autonomously with built-in issue tracking, planning, and file operations.
Components:
- `Agent/WorkFolderContext.swift` — Folder context models and project detection
- `Agent/WorkFolderContextService.swift` — Folder selection and security-scoped bookmarks
- `Agent/AgentFolderTools.swift` — File and shell operation tools
- `Agent/WorkFileOperation.swift` — File operation models
- `Agent/WorkFileOperationLog.swift` — Operation logging with undo support
- `Models/WorkModels.swift` — Core data models (Issue, WorkTask, LoopState, etc.)
- `Services/WorkEngine.swift` — Main task execution coordinator
- `Services/WorkExecutionEngine.swift` — Reasoning loop execution engine
- `Managers/IssueManager.swift` — Issue lifecycle and dependency management
- `Storage/WorkDatabase.swift` — SQLite storage for issues, tasks, and conversation turns
- `Tools/WorkTools.swift` — Agent-specific tools (complete_task, create_issue, generate_artifact, etc.)
- `Views/WorkView.swift` — Main Work Mode UI
- `Views/WorkSession.swift` — Observable session state manager
Features:
- Issue Tracking — Tasks broken into issues with status, priority, type, and dependencies
- Parallel Tasks — Run multiple agent tasks simultaneously for increased productivity
- Reasoning Loop — AI autonomously iterates through observe-think-act-check cycles (max 30 iterations)
- Working Directory — Select a folder for file operations with project type detection
- File Operations — Read, write, edit, search, move, copy, delete files with undo support
- Follow-up Issues — Agent creates child issues via the `create_issue` tool when it discovers additional work
- Clarification — Agent pauses to ask when tasks are ambiguous
- Background Execution — Tasks continue running after closing the window
- Token Usage Tracking — Monitor cumulative input/output tokens per task
Issue Properties:
| Property | Description |
|---|---|
| `status` | `open`, `in_progress`, `blocked`, `closed` |
| `priority` | P0 (critical), P1 (high), P2 (medium), P3 (low) |
| `type` | `task`, `bug`, `discovery` |
| `title` | Brief description of the work |
| `description` | Detailed explanation and context |
| `result` | Outcome after completion |
Available Tools:
| Tool | Description |
|---|---|
| `file_tree` | List directory structure with filtering |
| `file_read` | Read file contents (supports line ranges) |
| `file_write` | Create or overwrite files |
| `file_edit` | Surgical text replacement within files |
| `file_search` | Search for text patterns across files |
| `file_move` | Move or rename files |
| `file_copy` | Duplicate files |
| `file_delete` | Remove files |
| `file_metadata` | Get file information (size, dates, etc.) |
| `dir_create` | Create directories |
| `shell_run` | Execute shell commands (requires permission) |
| `git_status` | Show repository status |
| `git_diff` | Display file differences |
| `git_commit` | Stage and commit changes (requires permission) |
Workflow (Reasoning Loop):
- User input creates a task with an initial issue
- Agent enters a reasoning loop (max 30 iterations per issue)
- Each iteration: the model observes context, decides on an action, calls a tool, and evaluates progress
- The model narrates its reasoning and explains actions as it works
- When additional work is found, the agent creates follow-up issues via `create_issue`
- When the task is complete, the agent calls `complete_task` with a summary and artifact
- Clarification pauses execution when the task is ambiguous
Storage: ~/.osaurus/work/work.db (SQLite)
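The loop itself is simple scaffolding around the model's decisions. A sketch with a toy stand-in for the model (the real engine feeds full context, executes real tools, and handles clarification pauses):

```python
MAX_ITERATIONS = 30  # per-issue cap, as described above

def run_issue(decide, max_iterations=MAX_ITERATIONS):
    """Skeleton of the observe-think-act-check loop.

    `decide` stands in for the model: given the transcript so far it
    returns a (tool_name, result) pair. The loop ends when the model
    calls complete_task or the iteration cap is hit.
    """
    transcript = []
    for _ in range(max_iterations):
        tool, result = decide(transcript)   # observe + think + act
        transcript.append((tool, result))   # check: record the outcome
        if tool == "complete_task":
            return transcript
    return transcript  # cap reached without completion

# A toy "model" that reads a file twice and then finishes:
def toy_model(transcript):
    if len(transcript) < 2:
        return ("file_read", "file contents")
    return ("complete_task", "summary of the work")

steps = run_issue(toy_model)
print(len(steps))  # → 3
```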
Purpose: Persist and manage chat conversations with per-session configuration.
Components:
- `Managers/ChatSessionsManager.swift` — Session list management
- `Models/ChatSessionData.swift` — Session data model
- `Models/ChatSessionStore.swift` — Session persistence
- `Views/Components/ChatSessionSidebar.swift` — Session history sidebar
Features:
- Automatic session persistence
- Session history with sidebar navigation
- Per-session model selection
- Per-session tool configuration overrides
- Context token estimation display
- Auto-generated titles from first message
Purpose: Extend Osaurus with custom functionality including tools, HTTP routes, storage, configuration UI, and web apps.
Components:
- `Tools/OsaurusTool.swift` — Tool protocol
- `Tools/ExternalTool.swift` — External plugin wrapper
- `Tools/ToolRegistry.swift` — Tool registration
- `Tools/SchemaValidator.swift` — JSON schema validation
- `Managers/PluginManager.swift` — Plugin discovery, loading, unloading
- `Services/PluginHostAPI.swift` — v2 host API callbacks (config, db, log)
- `Storage/PluginDatabase.swift` — Sandboxed per-plugin SQLite database
- `Models/PluginHTTP.swift` — HTTP request/response models, rate limiter, MIME types
- `Models/ExternalPlugin.swift` — C ABI wrapper with v1/v2 support
- `Views/PluginConfigView.swift` — Native SwiftUI config UI renderer
- `Views/PluginsView.swift` — Plugin detail view (README, Settings, Changelog, Routes)
Plugin Types:
- v1 plugins — Tools only, via `osaurus_plugin_entry`
- v2 plugins — Tools + routes + storage + config, via `osaurus_plugin_entry_v2`
- System plugins — Built-in tools (filesystem, browser, git, etc.)
- MCP provider tools — Tools from remote MCP servers
Plugin Capabilities (v2):
| Capability | Manifest Key | Description |
|---|---|---|
| Tools | `capabilities.tools` | AI-callable functions |
| Routes | `capabilities.routes` | HTTP endpoints (OAuth, webhooks, APIs) |
| Config | `capabilities.config` | Native settings UI with validation |
| Web | `capabilities.web` | Static frontend serving with context injection |
| Docs | `docs` | README, changelog, and external links |
Routes are agent-scoped via the `enabledPlugins` map on each agent. See PLUGIN_AUTHORING.md for the full reference.
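Putting the capability keys together, a hypothetical v2 plugin manifest might look like this; the file name and full schema live in PLUGIN_AUTHORING.md, so every field beyond the capability keys above is illustrative:

```python
# Hypothetical v2 plugin manifest using the documented capability keys.
manifest = {
    "id": "com.example.weather",
    "version": "1.0.0",
    "capabilities": {
        "tools": [{"name": "get_forecast"}],
        "routes": [{"path": "/webhook", "method": "POST"}],
        "config": [{"key": "api_key", "type": "string", "secret": True}],
        "web": {"root": "web/"},
    },
    "docs": {"readme": "README.md", "changelog": "CHANGELOG.md"},
}

capability_names = sorted(manifest["capabilities"])
print(capability_names)  # → ['config', 'routes', 'tools', 'web']
```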
Purpose: Import and manage reusable AI capabilities following the Agent Skills specification.
Components:
- `Managers/SkillManager.swift` — Skill CRUD, persistence, and loading
- `Services/CapabilityService.swift` — Two-phase capability selection
- `Services/GitHubSkillService.swift` — GitHub repository import
- `Models/Skill.swift` — Skill data model
- `Models/CapabilityCatalog.swift` — Capability catalog structure
- `Tools/SelectCapabilitiesTool.swift` — AI tool for selecting capabilities
- `Views/SkillsView.swift` — Skill management UI
- `Views/SkillEditorSheet.swift` — Skill editor
Features:
- GitHub Import — Import from repositories with `.claude-plugin/marketplace.json`
- File Import — Load `.md` (Agent Skills), `.json`, or `.zip` packages
- Built-in Skills — 6 pre-installed skills for common use cases
- Reference Files — Attach text files loaded into skill context
- Asset Files — Support files for skills
- Categories — Organize skills by type
- Agent Integration — Per-agent skill enable/disable
Two-Phase Capability Selection:
A context optimization system that reduces token usage by ~80%:
| Phase | What's Loaded | Token Usage |
|---|---|---|
| Phase 1 (Selection) | Catalog only (name + description) | ~10-20 tokens per skill |
| Phase 2 (Execution) | Full instructions for selected skills | Full content |
Workflow:
- System prompt includes lightweight capability catalog
- AI calls `select_capabilities` with desired tools/skills
- Full schemas/instructions loaded for selected items only
- Subsequent messages use selected capabilities
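The savings come from deferring the heavy `instructions` bodies. A sketch of the two phases (the skill data is made up for illustration):

```python
skills = {
    "pdf-summarizer": {
        "description": "Summarize PDF documents",
        "instructions": "Step 1: extract text." + " x" * 400,  # long body
    },
    "sql-helper": {
        "description": "Write and explain SQL",
        "instructions": "Dialect notes." + " y" * 400,
    },
}

def phase1_catalog(skills):
    """Phase 1: only names and one-line descriptions go into the prompt."""
    return [{"name": n, "description": s["description"]} for n, s in skills.items()]

def phase2_load(skills, selected):
    """Phase 2: full instructions, but only for what select_capabilities chose."""
    return {n: skills[n]["instructions"] for n in selected if n in skills}

catalog = phase1_catalog(skills)
loaded = phase2_load(skills, ["sql-helper"])
print(len(catalog), list(loaded))  # 2 catalog entries, 1 fully loaded
```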
Skill Properties:
| Property | Description |
|---|---|
| `name` | Display name (required) |
| `description` | Brief description |
| `instructions` | Full AI instructions (markdown) |
| `category` | Optional category for organization |
| `version` | Skill version |
| `author` | Skill author |
| `references/` | Text files loaded into context |
| `assets/` | Supporting files |
Storage: ~/.osaurus/skills/{skill-name}/SKILL.md
Purpose: Provide speech-to-text transcription using on-device FluidAudio models.
Components:
- `Services/SpeechService.swift` — Core transcription service with streaming support
- `Managers/SpeechModelManager.swift` — Model download and selection
- `Models/SpeechConfiguration.swift` — Voice input settings
- `Views/VoiceView.swift` — Voice settings UI
- `Views/VoiceSetupTab.swift` — Guided setup wizard
- `Views/Components/VoiceInputOverlay.swift` — Voice input UI in chat
Features:
- Real-time streaming transcription — See words as you speak
- Multiple Parakeet models — Tiny (75 MB) to Large V3 (3 GB)
- English-only and multilingual — Choose based on your needs
- Microphone input — Built-in or external device selection
- System audio capture — Transcribe computer audio (macOS 12.3+)
- Configurable sensitivity — Low, Medium, High thresholds
- Auto-send with confirmation — Hands-free message sending
- Pause duration control — Adjust silence detection timing
Configuration:
| Setting | Description |
|---|---|
| `defaultModel` | Selected Parakeet model ID |
| `languageHint` | ISO 639-1 language code (e.g., "en", "es") |
| `sensitivity` | Voice detection sensitivity (low/medium/high) |
| `pauseDuration` | Seconds of silence before auto-send |
| `confirmationDelay` | Seconds to show confirmation before sending |
| `selectedInputSource` | Microphone or system audio |
Model Storage: ~/Library/Application Support/FluidAudio/Models/
Purpose: Enable hands-free agent activation through wake-word detection.
Components:
- `Services/VADService.swift` — Always-on listening and wake-word detection
- `Models/VADConfiguration.swift` — VAD settings and enabled agents
- `Views/ContentView.swift` — VAD toggle button in popover
- `AppDelegate.swift` — VAD status indicator in menu bar icon
- `Models/AgentNameDetector.swift` — Agent name matching logic
Features:
- Wake-word activation — Say an agent's name to open chat
- Custom wake phrase — Set a phrase like "Hey Osaurus"
- Per-agent enablement — Choose which agents respond to voice
- Menu bar indicator — Shows listening status with audio level
- Auto-start voice input — Begin recording after activation
- Silence timeout — Auto-close chat after inactivity
- Background listening — Continues when chat is closed
Configuration:
| Setting | Description |
|---|---|
| `vadModeEnabled` | Master toggle for VAD mode |
| `enabledAgentIds` | UUIDs of agents that respond to wake-words |
| `customWakePhrase` | Optional phrase like "Hey Osaurus" |
| `wakeWordSensitivity` | Detection sensitivity level |
| `autoStartVoiceInput` | Start recording after activation |
| `silenceTimeoutSeconds` | Auto-close timeout (0 = disabled) |
Workflow:
- VAD listens in background using FluidAudio
- Transcription is checked for agent names or wake phrase
- On match, chat opens with the detected agent
- Voice input starts automatically (if enabled)
- After chat closes, VAD resumes listening
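The name/wake-phrase check can be sketched as a case-insensitive substring match on the running transcription; AgentNameDetector's real matching rules may be stricter (word boundaries, fuzziness), so treat this as the idea only:

```python
def detect_activation(transcript, agent_names, wake_phrase=None):
    """Check a transcription against enabled agent names and the optional
    custom wake phrase, case-insensitively. Returns what matched, or None."""
    text = transcript.lower()
    if wake_phrase and wake_phrase.lower() in text:
        return "wake-phrase"
    for name in agent_names:
        if name.lower() in text:
            return name
    return None

print(detect_activation("okay hey osaurus, what's up", [], "Hey Osaurus"))  # → wake-phrase
print(detect_activation("let me ask Rex about this", ["Rex", "Dino"]))      # → Rex
print(detect_activation("nothing to see here", ["Rex"]))                    # → None
```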
Purpose: Enable global speech-to-text dictation directly into any focused text field using accessibility APIs.
Components:
- `Services/TranscriptionModeService.swift` — Main orchestration service
- `Services/KeyboardSimulationService.swift` — Simulates keyboard input via CGEventPost
- `Services/TranscriptionOverlayWindowService.swift` — Floating overlay panel management
- `Managers/TranscriptionHotKeyManager.swift` — Global hotkey registration
- `Models/TranscriptionConfiguration.swift` — Configuration and persistence
- `Views/TranscriptionOverlayView.swift` — Minimal floating UI
- `Views/TranscriptionModeSettingsTab.swift` — Settings UI in Voice tab
Features:
- Global Hotkey — Configurable hotkey to trigger transcription from anywhere
- Live Typing — Transcribed text is typed directly into the focused text field
- Accessibility Integration — Uses macOS accessibility APIs (requires permission)
- Minimal Overlay — Sleek floating UI shows recording status with waveform
- Esc to Cancel — Press Escape or click Done to stop transcription
- Real-time Feedback — Audio level visualization during recording
Configuration:
| Setting | Description |
|---|---|
| `transcriptionModeEnabled` | Master toggle for transcription mode |
| `hotkey` | Global hotkey to activate transcription |
Requirements:
- Microphone permission (for audio capture)
- Accessibility permission (for keyboard simulation)
- Parakeet model downloaded
Workflow:
- User presses the configured hotkey
- Overlay appears showing recording status
- FluidAudio transcribes speech in real-time
- Text is typed into the currently focused text field via accessibility APIs
- User presses Esc or clicks Done to stop
- Overlay disappears and transcription ends
Purpose: Persistent, multi-layer memory system that extracts, stores, and retrieves knowledge from conversations to provide personalized, context-aware AI interactions.
Components:
- `Services/MemoryService.swift` — Core actor for conversation processing, extraction, and summarization
- `Services/MemorySearchService.swift` — Hybrid search (BM25 + vector) with MMR reranking
- `Services/MemoryContextAssembler.swift` — Assembles memory context for system prompt injection
- `Storage/MemoryDatabase.swift` — SQLite database with WAL mode and schema migrations
- `Models/MemoryModels.swift` — Data types for all 4 memory layers
- `Models/MemoryConfiguration.swift` — User-configurable settings with validation
- `Views/MemoryView.swift` — Memory management UI (profile, overrides, agents, statistics, config)
4-Layer Architecture:
| Layer | Type | Purpose | Retention |
|---|---|---|---|
| Layer 1 | User Profile | Auto-generated user summary with version tracking | Permanent |
| Layer 2 | Working Memory | Structured entries (facts, preferences, decisions, etc.) | Per-agent limit (default 500) |
| Layer 3 | Conversation Summaries | Compressed session recaps | Configurable (default 7 days) |
| Layer 4 | Conversation Chunks | Raw conversation turns | Permanent |
Memory Entry Types:
| Type | Description |
|---|---|
| `fact` | Factual information about the user or their environment |
| `preference` | User preferences and likes/dislikes |
| `decision` | Decisions made during conversations |
| `correction` | Corrections to previous information |
| `commitment` | Promises or plans the user mentioned |
| `relationship` | Relationships between people, projects, or concepts |
| `skill` | Skills, expertise, or knowledge areas |
Knowledge Graph:
- Entities: person, company, place, project, tool, concept, event
- Relationships with confidence scores and temporal validity
- Graph traversal search by entity name or relation type
Search & Retrieval:
| Method | Backend | Fallback |
|---|---|---|
| Hybrid search | VecturaKit (BM25 + vector embeddings) | SQLite LIKE queries |
| MMR reranking | Jaccard similarity for diversity | N/A |
- Default MMR lambda: 0.7 (relevance vs. diversity tradeoff)
- Default fetch multiplier: 2.0x over-fetch before reranking
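MMR balances relevance against redundancy: each pick maximizes `lambda * relevance - (1 - lambda) * similarity_to_already_selected`. A sketch using word-set Jaccard as the similarity, with the default lambda above:

```python
def jaccard(a, b):
    """Word-set Jaccard similarity between two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def mmr_rerank(candidates, scores, k, lam=0.7):
    """Maximal Marginal Relevance: greedily pick items that are relevant
    (high search score) but not redundant with items already picked."""
    remaining = list(range(len(candidates)))
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max(
                (jaccard(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]

docs = [
    "user prefers dark mode in all apps",
    "user prefers dark mode everywhere",   # near-duplicate of the first
    "user works on a Swift project called Osaurus",
]
# Picks the distinct third doc over the near-duplicate second one:
print(mmr_rerank(docs, [0.9, 0.8, 0.7], k=2, lam=0.7))
```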
Verification Pipeline (3 layers, no LLM calls):
| Layer | Method | Threshold |
|---|---|---|
| Layer 1 | Jaccard word-overlap deduplication | 0.6 |
| Layer 2 | Contradiction detection (same type + similarity) | 0.3 |
| Layer 3 | Semantic similarity via vector search | 0.85 |
Context Assembly:
Memory is injected into system prompts in this order with per-section token budgets:
| Section | Default Budget |
|---|---|
| User Overrides | (always included) |
| User Profile | 2,000 tokens |
| Working Memory | 500 tokens |
| Conversation Summaries | 1,000 tokens |
| Key Relationships | 300 tokens |
Results are cached for 10 seconds per agent.
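Assembly is a straightforward fold over the section order with per-section caps. A sketch: the order and budgets come from the table above, while the chars/4 token estimate and the section headers are assumptions for illustration:

```python
SECTION_BUDGETS = [            # injection order and default budgets
    ("user_overrides", None),  # always included, no cap
    ("user_profile", 2000),
    ("working_memory", 500),
    ("conversation_summaries", 1000),
    ("key_relationships", 300),
]

def estimate_tokens(text):
    return max(1, len(text) // 4)  # rough chars/4 heuristic

def assemble_context(sections):
    """Concatenate non-empty sections in order, trimming each to its budget."""
    parts = []
    for name, budget in SECTION_BUDGETS:
        text = sections.get(name, "")
        if not text:
            continue
        if budget is not None and estimate_tokens(text) > budget:
            text = text[: budget * 4]  # clip to roughly the token budget
        parts.append(f"## {name}\n{text}")
    return "\n\n".join(parts)

ctx = assemble_context({
    "user_profile": "Prefers Swift. " * 1000,   # far over the 2,000-token budget
    "working_memory": "Fact: uses port 1337.",
})
print(estimate_tokens(ctx))  # profile clipped near its 2,000-token budget
```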
Resilience:
- Circuit breaker: opens after 5 consecutive failures, 60-second cooldown
- Retry logic: exponential backoff (1s, 2s, 4s), max 3 retries, 60-second timeout
- Actor-based concurrency for thread safety
- Non-blocking: all extraction runs in the background
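The failure policy composes a consecutive-failure circuit breaker with capped exponential backoff (1 s, 2 s, 4 s). A sketch of the breaker with an injected clock so it can be exercised without sleeping; this models the described policy, not the actual MemoryService code:

```python
import time

BACKOFF = [1.0, 2.0, 4.0]  # retry delays in seconds (max 3 retries)

class MemoryCircuitBreaker:
    """Opens after `threshold` consecutive failures; allows traffic again
    once `cooldown` seconds have elapsed."""

    def __init__(self, threshold=5, cooldown=60.0, clock=None):
        self.threshold, self.cooldown = threshold, cooldown
        self.clock = clock or time.monotonic
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            self.opened_at, self.failures = None, 0  # cooldown elapsed: reset
            return True
        return False

    def record(self, success):
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()

now = [0.0]
cb = MemoryCircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    cb.record(success=False)
print(cb.allow())   # open → False
now[0] += 61
print(cb.allow())   # cooldown elapsed → True
```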
Configuration:
| Setting | Default | Range |
|---|---|---|
| `coreModelProvider` | `anthropic` | Any provider |
| `coreModelName` | `claude-haiku-4-5` | Any model |
| `embeddingBackend` | `mlx` | `mlx`, `none` |
| `embeddingModel` | `nomic-embed-text-v1.5` | Any embedding model |
| `summaryDebounceSeconds` | 60 | 10–3,600 |
| `profileMaxTokens` | 2,000 | 100–50,000 |
| `profileRegenerateThreshold` | 10 | 1–100 |
| `workingMemoryBudgetTokens` | 500 | 50–10,000 |
| `summaryRetentionDays` | 7 | 1–365 |
| `summaryBudgetTokens` | 1,000 | 50–10,000 |
| `graphBudgetTokens` | 300 | 50–5,000 |
| `recallTopK` | 10 | 1–100 |
| `mmrLambda` | 0.7 | 0.0–1.0 |
| `mmrFetchMultiplier` | 2.0 | 1.0–10.0 |
| `maxEntriesPerAgent` | 500 | 0–10,000 |
| `enabled` | true | true/false |
Storage: ~/.osaurus/memory/memory.sqlite (SQLite with WAL mode)
| Document | Purpose |
|---|---|
| README.md | Project overview, quick start, feature highlights |
| FEATURES.md | Feature inventory and architecture (this file) |
| WATCHERS.md | Watchers and folder monitoring guide |
| WORK.md | Agents and autonomous task execution guide |
| REMOTE_PROVIDERS.md | Remote provider setup and configuration |
| REMOTE_MCP_PROVIDERS.md | Remote MCP provider setup |
| DEVELOPER_TOOLS.md | Insights and Server Explorer guide |
| VOICE_INPUT.md | Voice input, FluidAudio, and VAD mode guide |
| SKILLS.md | Skills and capability selection guide |
| MEMORY.md | Memory system and configuration guide |
| PLUGIN_AUTHORING.md | Creating custom plugins |
| OpenAI_API_GUIDE.md | API usage, tool calling, streaming |
| SHARED_CONFIGURATION_GUIDE.md | Shared configuration for teams |
| CONTRIBUTING.md | Contribution guidelines |
| SECURITY.md | Security policy |
| CODE_OF_CONDUCT.md | Community standards |
| SUPPORT.md | Getting help |
When adding a new feature:
- Add a row to the Feature Matrix with status, README section, documentation, and code location
- Add a Feature Details section if the feature is significant
- Update the Architecture Overview if the feature adds new components
- Update the Documentation Index if new docs are created
- Update the README if the feature should be highlighted
When modifying an existing feature:
- Update the relevant row in the Feature Matrix
- Update any affected documentation files
- Note breaking changes in the feature's documentation
| Status | Meaning |
|---|---|
| Stable | Production-ready, fully documented |
| Beta | Functional but API may change |
| Experimental | Work in progress, use with caution |
| Deprecated | Scheduled for removal, migrate away |
| macOS 26+ | Requires macOS 26 (Tahoe) or later |