Commit 00dc18c

Merge pull request RunanywhereAI#297 from vedantagarwal-web/add-playground-projects
Add Playground folder with swift-starter-app and on-device-browser-agent
2 parents 2f7cb15 + 4383441 commit 00dc18c

81 files changed, with 20411 additions and 0 deletions

Playground/README.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# Playground

Interactive demo projects showcasing what you can build with RunAnywhere.

| Project | Description | Platform |
|---------|-------------|----------|
| [swift-starter-app](swift-starter-app/) | Privacy-first AI demo: LLM Chat, Speech-to-Text, Text-to-Speech, and Voice Pipeline with VAD | iOS (Swift/SwiftUI) |
| [on-device-browser-agent](on-device-browser-agent/) | On-device AI browser automation using WebLLM: no cloud, no API keys, fully private | Chrome Extension (TypeScript/React) |

## swift-starter-app

A full-featured iOS app demonstrating the RunAnywhere SDK's core capabilities:

- **LLM Chat**: On-device conversation with local language models
- **Speech-to-Text**: Whisper-powered transcription
- **Text-to-Speech**: Neural voice synthesis
- **Voice Pipeline**: Integrated STT → LLM → TTS with Voice Activity Detection

**Requirements:** iOS 17.0+, Xcode 15.0+

## on-device-browser-agent

A Chrome extension that automates browser tasks entirely on-device using WebLLM and WebGPU:

- **Two-agent architecture**: Planner + Navigator for intelligent task execution
- **DOM and Vision modes**: Text-based or screenshot-based page understanding
- **Site-specific handling**: Optimized workflows for Amazon, YouTube, and more
- **Fully offline**: All AI inference runs locally on GPU after initial model download

**Requirements:** Chrome 124+ (WebGPU support)
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# Dependencies
node_modules/

# Build output
dist/

# IDE
.idea/
.vscode/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Logs
*.log
npm-debug.log*

# Environment
.env
.env.local
.env.*.local
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Local Browser Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
# Local Browser - On-Device AI Web Automation

**Note:** Support for `runanywhere-web-sdk` is launching soon in our main repo. Check it out: https://github.com/RunanywhereAI/runanywhere-sdks

A Chrome extension that uses WebLLM to run AI-powered web automation entirely on-device. No cloud APIs, no API keys, fully private.

## Demo

https://github.com/user-attachments/assets/898cc5c2-db77-4067-96e6-233c5da2bae5


## Features

- **On-Device AI**: Uses WebLLM with WebGPU acceleration for local LLM inference
- **Multi-Agent System**: Planner + Navigator agents for intelligent task execution
- **Browser Automation**: Navigate, click, type, and extract data from web pages
- **Privacy-First**: All AI runs locally, no data leaves your device
- **Offline Support**: Works offline after initial model download

## Quick Start

### Prerequisites

- **Chrome 124+** (required for WebGPU in service workers)
- **Node.js 18+** and npm
- **GPU with WebGPU support** (most modern GPUs work)

### Installation

1. **Clone and install dependencies**:
   ```bash
   cd local-browser
   npm install
   ```

2. **Build the extension**:
   ```bash
   npm run build
   ```

3. **Load in Chrome**:
   - Open `chrome://extensions`
   - Enable "Developer mode" (top right)
   - Click "Load unpacked"
   - Select the `dist` folder from this project

4. **First run**:
   - Click the extension icon in your toolbar
   - The first run downloads the AI model (~1GB)
   - The model is cached for future use

### Usage

1. Navigate to any webpage
2. Click the Local Browser extension icon
3. Type a task like:
   - "Search for 'WebGPU' on Wikipedia and extract the first paragraph"
   - "Go to example.com and tell me what's there"
   - "Find the search box and search for 'AI news'"
4. Watch the AI execute the task step by step

## Development

### Development Mode

```bash
npm run dev
```

This watches for changes and rebuilds automatically.

### Project Structure

```
local-browser/
├── manifest.json           # Chrome extension manifest (MV3)
├── src/
│   ├── background/         # Service worker
│   │   ├── index.ts        # Entry point & message handling
│   │   ├── llm-engine.ts   # WebLLM wrapper
│   │   └── agents/         # AI agent system
│   │       ├── base-agent.ts
│   │       ├── planner-agent.ts
│   │       ├── navigator-agent.ts
│   │       └── executor.ts
│   ├── content/            # Content scripts
│   │   ├── dom-observer.ts # Page state extraction
│   │   └── action-executor.ts
│   ├── popup/              # React popup UI
│   │   ├── App.tsx
│   │   └── components/
│   └── shared/             # Shared types & constants
└── dist/                   # Build output
```

### How It Works

1. **User enters a task** in the popup UI
2. **Planner Agent** analyzes the task and creates a high-level strategy
3. **Navigator Agent** examines the current page DOM and decides on the next action
4. **Content Script** executes the action (click, type, extract, etc.)
5. The loop continues until the task is complete or fails
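
The steps above can be sketched as a small control loop. This is a minimal illustration, not the extension's actual code: the `PlannerAgent`, `NavigatorAgent`, and `PageBridge` interfaces, the `runTask` function, and the `maxSteps` cap are all hypothetical names chosen for the example.

```typescript
// Hypothetical types for illustration; not taken from the extension's source.
type LoopAction =
  | { type: "click"; selector: string }
  | { type: "type"; selector: string; text: string }
  | { type: "done"; result: string };

interface PlannerAgent {
  // Turns the user's task into a list of high-level steps.
  plan(task: string): Promise<string[]>;
}

interface NavigatorAgent {
  // Chooses the next action from the plan, the page state, and past actions.
  nextAction(plan: string[], pageState: string, history: LoopAction[]): Promise<LoopAction>;
}

interface PageBridge {
  observe(): Promise<string>; // e.g. a serialized DOM snapshot
  execute(action: LoopAction): Promise<void>; // performed by the content script
}

async function runTask(
  task: string,
  planner: PlannerAgent,
  nav: NavigatorAgent,
  page: PageBridge,
  maxSteps = 10, // assumed safety cap so a confused model cannot loop forever
): Promise<string> {
  const plan = await planner.plan(task);
  const history: LoopAction[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const pageState = await page.observe();
    const action = await nav.nextAction(plan, pageState, history);
    if (action.type === "done") return action.result;
    await page.execute(action);
    history.push(action);
  }
  throw new Error(`Task did not finish within ${maxSteps} steps`);
}
```

Capping the number of steps is one simple way to turn "or fails" into a concrete stopping condition; the real extension may use a different rule.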

### Agent System

The extension uses a two-agent architecture inspired by Nanobrowser:

- **PlannerAgent**: Strategic planning, creates step-by-step approach
- **NavigatorAgent**: Tactical execution, chooses specific actions based on page state

Both agents output structured JSON that is parsed and executed.
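
As a rough sketch of the parsing step, the function below pulls a JSON object out of raw model text and checks its `type` against the action names listed under Limitations. The JSON shape (a `type` field plus free-form parameters) and the `parseAgentAction` name are assumptions for illustration; the extension's real schema is not shown here.

```typescript
// Action names come from the Limitations section; the JSON shape is assumed.
const ACTION_TYPES = ["navigate", "click", "type", "extract", "scroll", "wait"] as const;
type ActionType = (typeof ACTION_TYPES)[number];

interface ParsedAction {
  type: ActionType;
  [param: string]: unknown; // remaining fields depend on the action
}

function parseAgentAction(raw: string): ParsedAction {
  // Small models often wrap JSON in prose or code fences, so take the
  // outermost {...} span instead of parsing the whole string.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }
  const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Model output is not a JSON object");
  }
  const type = (parsed as { type?: unknown }).type;
  if (typeof type !== "string" || (ACTION_TYPES as readonly string[]).indexOf(type) === -1) {
    throw new Error(`Unsupported action type: ${String(type)}`);
  }
  return parsed as ParsedAction;
}
```

Rejecting unknown action types before execution keeps a malformed model response from reaching the content script.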

## Model Configuration

Default model: `Qwen2.5-1.5B-Instruct-q4f16_1-MLC` (~1GB)

Alternative models (configured in `src/shared/constants.ts`):
- `Phi-3.5-mini-instruct-q4f16_1-MLC` (~2GB, better reasoning)
- `Llama-3.2-1B-Instruct-q4f16_1-MLC` (~0.7GB, smaller)
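
One way such a configuration could be laid out is sketched below. The model IDs and approximate sizes are the ones listed above; the `ModelOption` shape, the `MODEL_OPTIONS` and `DEFAULT_MODEL_ID` names, and the `pickModel` helper are hypothetical and may not match the contents of `src/shared/constants.ts`.

```typescript
// Hypothetical layout; the real constants.ts is not shown in this README.
interface ModelOption {
  id: string;
  approxSizeGB: number;
  note: string;
}

const MODEL_OPTIONS: readonly ModelOption[] = [
  { id: "Qwen2.5-1.5B-Instruct-q4f16_1-MLC", approxSizeGB: 1, note: "default" },
  { id: "Phi-3.5-mini-instruct-q4f16_1-MLC", approxSizeGB: 2, note: "better reasoning" },
  { id: "Llama-3.2-1B-Instruct-q4f16_1-MLC", approxSizeGB: 0.7, note: "smaller" },
];

const DEFAULT_MODEL_ID = "Qwen2.5-1.5B-Instruct-q4f16_1-MLC";

// Returns the largest listed model that fits the given download budget,
// or undefined when none fits.
function pickModel(maxSizeGB: number): ModelOption | undefined {
  let best: ModelOption | undefined;
  for (const option of MODEL_OPTIONS) {
    if (option.approxSizeGB > maxSizeGB) continue;
    if (best === undefined || option.approxSizeGB > best.approxSizeGB) {
      best = option;
    }
  }
  return best;
}
```

Keeping the size next to each ID makes the trade-off between download size and reasoning quality explicit when switching models.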

## Troubleshooting

### WebGPU not supported
- Update Chrome to version 124 or later
- Check `chrome://gpu` to verify WebGPU status
- Some GPUs may not support WebGPU
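
The same check can be made programmatically. In WebGPU, `navigator.gpu` is absent when the API is unavailable, and `navigator.gpu.requestAdapter()` resolves to `null` when no suitable GPU adapter is found. The sketch below takes a navigator-like object as a parameter so it does not depend on WebGPU type definitions; the `checkWebGPU` name and its return values are assumptions for illustration.

```typescript
// Minimal structural types so the example compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGpuStatus = "ok" | "no-api" | "no-adapter";

async function checkWebGPU(nav: NavigatorLike): Promise<WebGpuStatus> {
  // Missing API: the browser is too old or WebGPU is disabled.
  if (!nav.gpu) return "no-api";
  // API present but no usable adapter: the GPU or driver is unsupported.
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? "ok" : "no-adapter";
}
```

In a browser context this could be called with the global `navigator` object, and the result used to show a more specific error message than a generic load failure.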

### Model fails to load
- Ensure you have enough disk space (~2GB free)
- Check browser console for errors
- Try clearing the extension's storage and reloading

### Actions not executing
- Some pages block content scripts (chrome://, extension pages)
- Try on a regular webpage like wikipedia.org
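
A guard like the one below could be run before dispatching an action to a tab. It is a conservative sketch that allows only `http:` and `https:` pages; the `canRunContentScript` name is hypothetical, and the extension may in fact handle other schemes.

```typescript
// Conservative assumption: only ordinary web pages are treated as scriptable.
function canRunContentScript(url: string): boolean {
  let protocol: string;
  try {
    protocol = new URL(url).protocol;
  } catch (_err) {
    return false; // not a parseable URL (e.g. an empty tab URL)
  }
  return protocol === "http:" || protocol === "https:";
}
```

For example, `canRunContentScript("chrome://extensions")` returns `false`, so the popup can tell the user to switch to a regular page instead of failing silently.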

### Extension not working after Chrome update
- Go to `chrome://extensions`
- Click the reload button on the extension

## Limitations

- **POC Scope**: This is a proof-of-concept, not production software
- **No Vision**: Uses text-only DOM analysis (no screenshot understanding)
- **Single Tab**: Only works with the currently active tab
- **Basic Actions**: Supports navigate, click, type, extract, scroll, wait
- **Model Size**: Smaller models may struggle with complex tasks

## Tech Stack

- **WebLLM**: On-device LLM inference with WebGPU
- **React**: Popup UI
- **TypeScript**: Type-safe development
- **Vite + CRXJS**: Chrome extension bundling
- **Chrome Extension Manifest V3**: Modern extension architecture

## Credits

This project is inspired by:
- [Nanobrowser](https://github.com/nanobrowser/nanobrowser) - Multi-agent web automation (MIT License)
- [WebLLM](https://github.com/mlc-ai/web-llm) - In-browser LLM inference (Apache-2.0 License)

### Dependency Licenses

| Package | License |
|---------|---------|
| @mlc-ai/web-llm | Apache-2.0 |
| React | MIT |
| Vite | MIT |
| @crxjs/vite-plugin | MIT |
| TypeScript | Apache-2.0 |

## License

MIT License - See LICENSE file for details.
3.43 MB
Binary file not shown.
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
{
  "manifest_version": 3,
  "name": "Local Browser - AI Web Agent",
  "version": "0.1.0",
  "description": "On-device AI web automation using WebLLM. No cloud, no API keys, fully private.",
  "permissions": [
    "storage",
    "scripting",
    "activeTab",
    "tabs",
    "offscreen"
  ],
  "host_permissions": [
    "<all_urls>",
    "https://huggingface.co/*",
    "https://cdn-lfs.huggingface.co/*",
    "https://cdn-lfs-us-1.huggingface.co/*",
    "https://raw.githubusercontent.com/*"
  ],
  "content_security_policy": {
    "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; object-src 'self';"
  },
  "background": {
    "service_worker": "src/background/index.ts",
    "type": "module"
  },
  "action": {
    "default_popup": "src/popup/index.html",
    "default_title": "Local Browser",
    "default_icon": {
      "16": "public/icons/icon16.png",
      "48": "public/icons/icon48.png",
      "128": "public/icons/icon128.png"
    }
  },
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["src/content/index.ts"],
      "run_at": "document_idle"
    }
  ],
  "icons": {
    "16": "public/icons/icon16.png",
    "48": "public/icons/icon48.png",
    "128": "public/icons/icon128.png"
  },
  "minimum_chrome_version": "124"
}
