Code Bench

Desktop AI coding assistant for local repositories. Bring your own model — Anthropic, OpenAI, Gemini, and Ollama work out of the box, or point it at any OpenAI-compatible custom endpoint. Chat over your repo, run your tools, edit files in place, and watch git state update inline.

Installation (macOS)

1. Download CodeBench-macos.dmg from the latest release.
2. Open the DMG and drag Code Bench into Applications.
3. Launch the app from Applications.

When you point Code Bench at a project under Documents, Downloads, or Desktop, macOS will ask "Code Bench would like to access files in your … folder." Click Allow. Code Bench reads project files from wherever you store them on disk, so it needs access to those user folders.

Features

The app is chat-centric: a single conversation surface with a project sidebar on the left, an inline changes panel that surfaces the agent's edits as they happen, and a top action bar for the active project's branch and PR state. Settings (⌘,) hosts the configuration sub-areas.

Surface	Capabilities
Chat	Streaming responses · per-session system prompt · model selector · agent loop with tool use · interactive permission prompts · "ask the user a question" cards
Project sidebar	Add/relocate local projects · session list per project · session archive · live git state (branch, ahead/behind, dirty-tree) · branch picker
Changes panel	Inline diff per edited file · per-change accept/reject · conflict-merge view when on-disk drifts from agent edits · commit dialog · create-PR dialog
Coding tools	Built-in tool registry (filesystem read/write, ripgrep, bash, web fetch) · per-tool denylist · ripgrep auto-detect · ready for MCP-server tool sources
MCP servers	Configure stdio and HTTP/SSE MCP servers · enable/disable per server · tool inventory surfaces in chat
Integrations	GitHub sign-in (Device Flow) · repository browser feeding the project sidebar
Providers	Multi-provider key storage (OpenAI · Anthropic · Gemini · Ollama · custom OpenAI-compatible endpoint) · per-provider connectivity test · keys in OS keychain
Settings → Reset	"Wipe all data" — clears API keys, GitHub sign-in, chat history, projects, and MCP servers in one step
Auto-update	Checks the GitHub Releases endpoint on launch and from Settings · verifies Team-ID match and `codesign`/`spctl` on macOS · self-installs and relaunches
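
The three checks named in the Auto-update row can be reproduced by hand when debugging a failed update. The sketch below is only an approximation under stated assumptions: `verify_update_bundle` and the example Team ID are hypothetical, and the real gates live in UpdateService and may pass different flags.

```shell
# Sketch only: mirrors the gates named in the table, not UpdateService itself.
# Usage (hypothetical path and Team ID):
#   verify_update_bundle "/Applications/Code Bench.app" ABCDE12345
verify_update_bundle() {
  app=$1 expected_team=$2

  # Gate 1: the signature is intact and covers nested code.
  codesign --verify --deep --strict "$app" || return 1

  # Gate 2: Gatekeeper would allow the bundle to run.
  spctl --assess --type execute "$app" || return 1

  # Gate 3: the signing Team ID matches the expected one.
  # codesign -dv writes its report to stderr, hence the 2>&1.
  team=$(codesign -dv --verbose=4 "$app" 2>&1 | sed -n 's/^TeamIdentifier=//p')
  [ "$team" = "$expected_team" ]
}
```

The function returns non-zero as soon as one gate fails, so a caller can refuse to install on any failure.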

Platforms

Platform	Status
macOS	✅ Supported — built, signed, notarized, and released in CI on every tag
Windows	⚠️ Unsupported — Flutter build target exists, but no CI, no signing, no released binaries. Use at your own risk and expect to fix things.
Linux	⚠️ Unsupported — same caveats as Windows. Both matrix entries are commented out in `.github/workflows/build.yml` and `.github/workflows/build-and-publish.yml`.

iOS, Android, and Web are out of scope.

If you want Windows or Linux to be a supported platform, the path is: re-enable the matrix entries in build.yml and build-and-publish.yml, add platform-appropriate code-signing, fix anything that breaks, and update this section. Until that happens, treat the desktop builds for those targets as a developer-only escape hatch.

Requirements

Dependency	Version
Flutter SDK	≥ 3.41.6 stable
Dart SDK	≥ 3.11.4
Xcode (macOS builds)	15+ Sequoia (Xcode CLI tools required)
Windows 10 (Windows builds)	1903+ (in development)
GTK 3 + ninja + cmake (Linux builds)	system packages (in development)
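
To check an installed SDK against the minimums above, a small dotted-version comparison is enough. This is a sketch: `version_ge` is a hypothetical helper, not part of the repository, and it assumes plain numeric versions with no pre-release suffix.

```shell
# version_ge A B: succeeds when dotted version A is >= B, compared
# numerically one component at a time. Missing components count as 0.
# Plain POSIX sh, so the working variables are global.
version_ge() {
  a=$1 b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}; y=${b%%.*}            # leading component of each version
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # Drop the leading component, or empty the string when none is left.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}
```

For example, `version_ge "$installed" 3.41.6 || echo "Flutter is too old"`, where `$installed` holds the version string reported by `flutter --version`.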

1. Clone and fetch packages

git clone git@github.com:mkappworks-dev/code-bench-app.git
cd code-bench-app
flutter pub get

2. Generate code

Drift (SQLite ORM) and Riverpod require a one-time code-generation step. Run this before the first build and whenever you modify database tables or add @riverpod providers:

dart run build_runner build --delete-conflicting-outputs

Use watch mode during active development:

dart run build_runner watch --delete-conflicting-outputs

3. Run

For macOS:

flutter run -d macos      # primary dev target

For Windows (unsupported):

flutter run -d windows

For Linux (unsupported):

flutter run -d linux

On first launch, the onboarding screen gates access until at least one AI provider API key is saved.

GitHub sign-in — Code Bench uses the OAuth 2.0 Device Authorization Grant (RFC 8628) on the Benchlabs Codebench GitHub App. The app's client_id is checked into source at ApiConstants.githubClientId and shipped in the binary — that's intentional. Device Flow treats client_id as a non-secret (the same reason there is no client_secret to ship), so embedding it carries no credential-leak risk; see docs/superpowers/specs/2026-05-03-github-app-device-flow-design.md for the full threat model.

Forks must register their own GitHub App and replace ApiConstants.githubClientId. To register: github.com → Settings → Developer settings → GitHub Apps → New GitHub App, tick Enable Device Flow at the bottom, and copy the resulting Iv23li… Client ID over the embedded value.
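
For forks that want to confirm a freshly registered Client ID works, the Device Flow exchange can be sketched with curl against GitHub's two documented endpoints. The function names and the example Client ID are hypothetical; the app performs this exchange in its own GitHub datasource, which may differ in detail.

```shell
# Step 1: request a device code and a user code for the given client_id.
# The JSON reply carries device_code, user_code, verification_uri, and interval.
request_device_code() {
  curl -sS -X POST https://github.com/login/device/code \
    -H 'Accept: application/json' \
    -d "client_id=$1"
}

# Step 2: poll for the token after the user enters the code in a browser.
# Until they do, GitHub answers with an "authorization_pending" error.
# Note that no client_secret is sent at any point.
poll_access_token() {
  curl -sS -X POST https://github.com/login/oauth/access_token \
    -H 'Accept: application/json' \
    -d "client_id=$1" \
    -d "device_code=$2" \
    -d 'grant_type=urn:ietf:params:oauth:grant-type:device_code'
}

# Usage (hypothetical client ID):
#   request_device_code Iv23liEXAMPLE
#   poll_access_token Iv23liEXAMPLE "<device_code from step 1>"
```

Polling should respect the `interval` value returned by step 1 rather than looping as fast as possible.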

Project Structure

lib/
├── main.dart                    # Entry point — ProviderScope, window_manager init
├── app.dart                     # MaterialApp.router wired to GoRouter
├── router/
│   └── app_router.dart          # GoRouter: onboarding guard + chat ShellRoute + settings route
├── shell/
│   ├── chat_shell.dart          # Sidebar + chat column + optional changes panel; ⌘N / ⌘, shortcuts
│   ├── notifiers/               # Top-action-bar and status-bar state
│   └── widgets/                 # AppLifecycleObserver, TopActionBar, StatusBar, ActionOutputPanel
├── core/                        # Constants, AppException hierarchy, theme/colors, utils, shared widgets
├── data/
│   ├── _core/                   # Drift AppDatabase, DioFactory, SecureStorage, preferences
│   ├── shared/                  # Cross-cutting models: AIModel, ChatMessage
│   ├── ai/                      # AI datasources (Dio), repository, models/
│   ├── session/                 # Session datasource (Drift), repository, models/ (ChatSession, ToolEvent, …)
│   ├── project/                 # Project datasource (Drift), repository, models/ (Project, WorkspaceProject, …)
│   ├── git/                     # Git datasource (Process), live-state datasource, repository, models/, exceptions
│   ├── github/                  # GitHub datasources (Dio + OAuth), repository, models/
│   ├── apply/                   # Apply datasource (filesystem), repository, security guard
│   ├── filesystem/              # Filesystem datasource (dart:io)
│   ├── bash/                    # Bash datasource (Process) — the one documented `runInShell` exception
│   ├── coding_tools/            # Tool inputs/outputs, denylist, registry-facing types
│   ├── mcp/                     # MCP config datasource (Drift), transport datasources (stdio + HTTP/SSE), repository, models/
│   ├── web_fetch/               # Web-fetch datasource (Dio)
│   ├── providers/               # Provider catalog + ProvidersService backing
│   ├── settings/                # Settings datasource (Drift + SharedPreferences), repository, models/
│   ├── update/                  # Update datasources (Dio for releases, Process for install, IO for sentinel), models/
│   └── integrations/            # Integration metadata (GitHub OAuth)
├── services/
│   ├── ai/                      # AIService — stream buffering, model resolution
│   ├── agent/                   # Agent loop — tool dispatch, permission prompts, iteration cap
│   ├── coding_tools/            # ToolRegistry, denylist service, ripgrep availability probe, individual tools/
│   ├── mcp/                     # MCP service — server lifecycle, tool inventory
│   ├── git/                     # GitService — composite git operations
│   ├── github/                  # GitHubService — OAuth + REST composition
│   ├── session/                 # SessionService — send-and-stream, history, archive
│   ├── project/                 # ProjectService — add/relocate, scan
│   ├── apply/                   # ApplyService — patch orchestration + security guard
│   ├── providers/               # ProvidersService — keychain-backed key storage
│   ├── api_key_test/            # ApiKeyTestService — provider connectivity checks
│   ├── ide/                     # IdeService — editor/terminal launch
│   ├── settings/                # SettingsService — wipe cascade, onboarding
│   └── update/                  # UpdateService — version comparison, codesign/spctl gates, swap-and-relaunch
└── features/
    ├── onboarding/              # First-run wizard (API keys, GitHub sign-in)
    ├── chat/                    # Chat UI, message streaming, agent permission prompts, code-apply actions
    ├── project_sidebar/         # Project list, session list, archive, branch picker triggers
    ├── branch_picker/           # Branch picker dialog + notifier
    ├── archive/                 # Archived sessions screen
    ├── general/                 # Settings → General (preferences, update section, reset section)
    ├── providers/               # Settings → Providers (per-provider keys + test)
    ├── integrations/            # Settings → Integrations (GitHub sign-in)
    ├── coding_tools/            # Settings → Coding Tools (denylist, ripgrep status)
    ├── mcp_servers/             # Settings → MCP Servers (configure, enable/disable)
    ├── update/                  # Update notifier, state, failure types, "Check now" UI
    └── settings/                # Settings shell + sub-area router

Architecture

Dependency rule

The dependency graph is strictly one-directional. Violating it is a build-review blocker:

Widgets / Screens
      ↓  (ref.watch / ref.read notifier)
  Notifiers          ← the only layer widgets may reach
      ↓  (ref.read service)
  Services           ← business logic, composition, typed exceptions
      ↓  (constructor injection)
  Repositories       ← domain interfaces; no I/O
      ↓
  Datasources        ← Dio, DB, Process.run, filesystem live here
      ↓
External (REST APIs / SQLite / OS)

Widgets communicate with notifiers only via ref.watch / ref.read(…notifier).method(). They never reach into a service or repository provider directly. Process.run, dart:io, and Dio are confined to lib/data/**/datasource/.
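
The confinement rule above lends itself to a quick grep before review. The helper below is a rough sketch, not a script that ships with the repository: `find_io_leaks` is a hypothetical name, the patterns only catch direct imports and literal `Process.run` calls, and it assumes lib/data/_core/ is exempt because DioFactory and SecureStorage live there.

```shell
# find_io_leaks [ROOT]: print Dart files that import dart:io or dio, or call
# Process.run, while living outside a datasource/ (or _core/) directory.
find_io_leaks() {
  root=${1:-lib}
  grep -rlE --include='*.dart' \
    "import ['\"](dart:io|package:dio/)|Process\.run" "$root" 2>/dev/null \
    | grep -v -e '/datasource/' -e '/_core/' || true
}
```

An empty result means no obvious violations. Any path it prints is a candidate to inspect by hand, not proof of a violation.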

Command notifiers (*Actions, e.g. ProjectSidebarActions, CodeApplyActions, GitActions) use void build() with keepAlive: true and expose imperative Future<void> methods. They are the bridge between the UI and the service layer.

Naming conventions:

Layer	Rule
Service class	ends in `Service` (`GitService`, `SessionService`)
Service provider	`@riverpod` function placed before the class it instantiates
Repository interface	ends in `Repository` (`GitRepository`, `AIRepository`)
Repository impl + provider	class ends in `RepositoryImpl`; `@riverpod` before it
Datasource file naming	suffix encodes I/O type: `_dio.dart`, `_process.dart`, `_io.dart`, `_drift.dart`
Command notifier	ends in `Actions`; `void build()`, `keepAlive: true`
State notifier	ends in `Notifier`; owns `AsyncValue` or value state
Notifier file placement	`_notifier.dart`, `_actions.dart`, and `*_failure.dart` all live in `{feature}/notifiers/`

The Riverpod generator strips the Notifier suffix from provider names (ActiveSessionIdNotifier → activeSessionIdProvider). The Actions suffix is kept (GitActions → gitActionsProvider). Widgets must never call ref.invalidate directly — route through a notifier method instead.

Layered architecture

Widgets are pure state-renderers. They call notifier methods and listen for AsyncError state to show snackbars — they never try/catch business-logic calls or import service/repository exception types.

Notifiers mediate all commands. *Actions notifiers extend AsyncNotifier<void>; failures are emitted as AsyncError carrying a typed sealed class {Notifier}Failure. *Notifier classes own reactive AsyncValue<T> data state.

Services own business logic and composition. They receive repositories via constructor injection, convert low-level I/O errors into typed domain exceptions, and expose a clean API to notifiers. Services are instantiated via @riverpod / @Riverpod(keepAlive: true) providers and never constructed directly.

Repositories are domain interfaces (lib/data/**/repository/). Implementations (*RepositoryImpl) are wired up via Riverpod providers and injected into services.

Datasources (lib/data/**/datasource/) are where all I/O lives: Dio HTTP calls, SQLite via Drift, Process.run, and dart:io filesystem access. File suffix encodes the I/O type: *_dio.dart, *_process.dart, *_io.dart, *_drift.dart.

The full rules — naming conventions, error-handling patterns, logging matrix, security guards — are in CLAUDE.md.

State management

Pattern	Used for
`@Riverpod(keepAlive: true)` class Notifier	Long-lived app state: active session ID, active project ID, selected model, system prompts, DB, storage
`@Riverpod(keepAlive: true)` class Actions	Imperative commands: `*Actions` notifiers expose `Future<void>` methods that mediate widget → service calls (e.g. `CodeApplyActions`, `ProjectSidebarActions`, `GitActions`)
`@riverpod` class AsyncNotifier	Chat messages (loads history, streams new messages)
`@riverpod` function (StreamProvider)	Session list, live git state, MCP server list — wraps Drift / Process stream sources
`@riverpod` function (FutureProvider)	One-shot reads: available model list, package version, last update-check timestamp

Local persistence

All data is stored in a local SQLite database managed by Drift (code_bench.db).

Table	Stores
`ChatSessions`	Session ID · title · model/provider · created/updated timestamps · pin flag · archive flag
`ChatMessages`	Message ID · session FK · role · content · extracted code blocks (JSON) · tool events (JSON) · timestamp
`WorkspaceProjects`	Project ID · name · local path · linked repo ID · active branch · associated session IDs
`McpServers`	Server ID · name · transport (stdio / HTTP-SSE) · command + args · env (JSON) · URL · enabled flag

DAOs: SessionDao (sessions + messages CRUD, stream watch) · ProjectDao (projects CRUD) · McpDao (servers CRUD, including deleteAll for the wipe cascade).

Secret storage

SecureStorageSource wraps flutter_secure_storage using a consistent key scheme:

Key	Holds
`api_key_{provider}`	API key per AI provider (e.g. `api_key_openai`)
`github_token`	GitHub OAuth access token
`ollama_base_url`	Custom Ollama server URL
`custom_endpoint_url`	OpenAI-compatible custom endpoint
`custom_endpoint_api_key`	Key for the custom endpoint

Platform	Backend
macOS	Keychain (`first_unlock` accessibility)
Windows	Windows Credential Manager
Linux	libsecret

Building for Distribution

For macOS:

flutter build macos --release   # → build/macos/Build/Products/Release/

The macOS App Sandbox is intentionally disabled. Code Bench shells out to git, code, cursor, and user-defined action commands, none of which can run inside the sandbox. See macos/Runner/README.md for the rationale, contributor rules, and distribution implications (Mac App Store eligibility, hardened runtime, notarization).

For Windows:

flutter build windows --release # → build/windows/x64/runner/Release/ (unsupported)

For Linux:

flutter build linux --release   # → build/linux/x64/release/bundle/ (unsupported)

Releasing (macOS)

Releases are managed by release-please. Every merge to main updates an open release PR that bumps pubspec.yaml, writes CHANGELOG.md, and proposes the next semver version based on conventional commit types (feat: → minor, fix: → patch, feat!: / BREAKING CHANGE: → major). Merging that PR triggers .github/workflows/release-please.yml, which creates a draft GitHub release and chains .github/workflows/build-and-publish.yml to build the macOS app, sign with a Developer ID, notarize through Apple's notary service, staple the ticket, and upload CodeBench-macos.dmg and CodeBench-macos.zip to the draft. The chained workflow then publishes the draft (PATCH draft=false), which is the moment the v* git tag gets created and the release becomes "latest." The in-app auto-updater consumes those artifacts on next launch of older clients.
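
The commit-type mapping described above can be illustrated with a small classifier. This is a simplified sketch of the stated rules only, not release-please's actual logic (which also depends on its configuration); `bump_for_commit` is a hypothetical name.

```shell
# bump_for_commit SUBJECT [BODY]: print major, minor, patch, or none.
bump_for_commit() {
  subject=$1 body=${2:-}
  head=${subject%%:*}        # e.g. "feat(chat)!" from "feat(chat)!: drop v1"

  # A "!" right before the colon, or a BREAKING CHANGE footer, wins.
  case $head in *'!') echo major; return 0 ;; esac
  case $body in *'BREAKING CHANGE:'*) echo major; return 0 ;; esac

  # Strip an optional "(scope)" and look at the bare type.
  case ${head%%\(*} in
    feat) echo minor ;;
    fix)  echo patch ;;
    *)    echo none ;;
  esac
}
```

For instance, `bump_for_commit 'fix(git): handle detached HEAD'` prints `patch`, while a `docs:` or `chore:` commit prints `none` and would not by itself move the version.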

Required GitHub Actions secrets

Add these under Settings → Secrets and variables → Actions before the first release:

Secret	Holds	How to get it
`MACOS_CERTIFICATE`	Base64-encoded Developer ID Application certificate (`.p12`)	Export from Keychain Access (right-click identity → Export → `.p12`), then `base64 -i cert.p12 | pbcopy` and paste
`MACOS_CERTIFICATE_PASSWORD`	Password set when exporting the `.p12`	The password you typed at export time
`MACOS_PROVISIONING_PROFILE`	Base64-encoded Developer ID provisioning profile	See macos/Runner/README.md for the full Developer Portal walkthrough
`APPLE_ID`	Apple ID email of the notarizing account	The email tied to your Apple Developer membership
`APPLE_ID_PASSWORD`	App-specific password — not your Apple ID password	appleid.apple.com → Sign-In and Security → App-Specific Passwords → Generate (label e.g. `code-bench-notarize`)
`APPLE_TEAM_ID`	10-character Team ID (also used as Xcode `DEVELOPMENT_TEAM`)	developer.apple.com/account → Membership Details
`RELEASE_PLEASE_TOKEN`	Personal access token (classic) with `repo` scope	github.com/settings/tokens → Generate new token (classic) → check `repo` → no expiry
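
As an alternative to pasting values into the web UI, the file-based secrets can be set with the GitHub CLI, which reads a secret's value from standard input. The sketch below assumes `gh` is installed and authenticated for the repository; `set_file_secret` and the file names in the usage comment are hypothetical.

```shell
# set_file_secret NAME FILE: store FILE's base64 encoding as Actions secret NAME.
set_file_secret() {
  name=$1 file=$2
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  # tr strips the line wraps that GNU base64 inserts; macOS base64 emits one line.
  base64 < "$file" | tr -d '\n' | gh secret set "$name"
}

# Usage (hypothetical file names):
#   set_file_secret MACOS_CERTIFICATE cert.p12
#   set_file_secret MACOS_PROVISIONING_PROFILE profile.provisionprofile
```

Plain-text secrets such as `APPLE_TEAM_ID` do not need encoding and can be set with `gh secret set APPLE_TEAM_ID`, which prompts for the value.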

Why a PAT for release-please? PRs created by the default GITHUB_TOKEN are blocked from triggering other workflows (GitHub's anti-loop protection). Without a PAT, the release PR's required status checks (Analyze & Test, Build (macos)) get stuck on "Expected — Waiting for status to be reported" and never run. A PAT makes the PR appear as user-created so CI fires normally.

Cutting a release

1. Merge feature/fix PRs to main as normal, using Conventional Commits (feat:, fix:, etc.).
2. release-please keeps a release PR open that accumulates all pending commits.
3. When you're ready to ship, merge the release PR.
4. CI tags, builds, notarizes, and publishes automatically; no manual steps are needed.

Never manually bump pubspec.yaml or push v* tags. release-please owns both. Manual bumps or tags will confuse the manifest and produce duplicate or mis-versioned releases.

Manual recovery (`workflow_dispatch`)

The Release (build + publish) workflow has a Run workflow button under Actions tab → Release (build + publish). Use it for the recovery scenarios below — it's the safety valve when the chained automatic flow fails or you need to operate on an existing release out-of-band. Do not use it for normal releases; merging a Release Please PR is what cuts a release.

Inputs:

Input	When to set it
`tag`	Always — the tag string, e.g. `v0.2.0`
`release_id`	Only when publishing a stuck draft; otherwise blank

Scenario A — stuck draft (no tag yet). release-please.yml ran but the chained run failed mid-way (build crashed, notarization timeout, upload failed). Result: a draft release exists in the Releases tab with a "Draft" badge, but no v* tag was created.

1. Open the draft release; copy the release ID from the URL (releases/edit/<id>).
2. Actions tab → Release (build + publish) → Run workflow.
3. Enter the tag (e.g. v0.2.0) and the release ID → Run.

The workflow re-runs the build, deletes any half-uploaded assets, re-uploads, and flips draft=false to publish — which is what finally creates the git tag.
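
If the release ID is awkward to read from the URL, the GitHub CLI can list draft releases with their numeric IDs. This is a sketch that assumes `gh` is installed, authenticated with push access, and run from inside a clone of the repository; `list_draft_releases` is a hypothetical name.

```shell
# Print "<release id><TAB><tag name>" for every draft release.
# gh fills in {owner} and {repo} from the current repository.
list_draft_releases() {
  gh api 'repos/{owner}/{repo}/releases' \
    --jq '.[] | select(.draft) | "\(.id)\t\(.tag_name)"'
}
```

The first column is the value to enter as `release_id` when running the workflow.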

Scenario B — re-upload assets to a published release. The release shipped, but the DMG was discovered to be corrupted, or you want to attach an additional file. Tag exists, release is non-draft.

1. Actions tab → Release (build + publish) → Run workflow.
2. Enter the tag, leave release ID blank → Run.

The workflow checks out the tag, builds fresh, and the softprops upload step replaces any same-named assets. The changelog is not overwritten.

Scenario C — orphaned tag with no release object. Rare. Someone pushed git tag v0.2.0 && git push origin v0.2.0 outside the Release Please flow. Tag exists, no release object yet.

Same steps as Scenario B (tag only, blank release ID). The softprops step creates the release for that tag and uploads the artifacts.

Scenario D — testing pipeline changes on an existing tag. You modified the build / sign / notarize steps in build-and-publish.yml and want to verify on a real tag without merging a Release Please PR.

Same steps as Scenario B. Be aware: this will replace existing assets on that release — pick a throwaway test tag if you don't want to disturb a real release.
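
The same recovery runs can be started from a terminal with the GitHub CLI instead of the Run workflow button. The sketch below assumes that build-and-publish.yml is the file behind the Release (build + publish) workflow and that `gh` is authenticated; `dispatch_release_recovery` is a hypothetical helper.

```shell
# dispatch_release_recovery TAG [RELEASE_ID]
# Pass RELEASE_ID only for Scenario A (stuck draft); omit it for B, C, and D.
dispatch_release_recovery() {
  tag=$1 release_id=${2:-}
  case $tag in
    v[0-9]*) ;;
    *) echo "tag must look like v0.2.0, got: $tag" >&2; return 2 ;;
  esac
  if [ -n "$release_id" ]; then
    gh workflow run build-and-publish.yml -f "tag=$tag" -f "release_id=$release_id"
  else
    gh workflow run build-and-publish.yml -f "tag=$tag"
  fi
}
```

The tag check is only a guard against typos. It does not confirm that the tag or the release exists.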

Testing & Linting

flutter test                         # run all tests
flutter analyze                      # static analysis
dart format lib/ test/               # format
dart format --set-exit-if-changed lib/ test/   # CI format check
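
The commands above can be chained into one local check, for example from a pre-push hook. This is a sketch rather than a script that ships with the repository; `run_checks` is a hypothetical name, and the ordering (cheapest check first) is a design choice, not a project rule.

```shell
# run_checks: stop at the first failing step and return its failure.
run_checks() {
  dart format --set-exit-if-changed lib/ test/ || return 1   # formatting drift
  flutter analyze || return 1                                # static analysis
  flutter test                                               # full test suite
}
```

Running the format check first means a whitespace-only problem is reported in seconds instead of after the test suite finishes.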

Extending Code Bench

Adding an AI provider

1. Add a value to the AIProvider enum in lib/data/shared/ai_model.dart.
2. Implement the streaming sendMessage path under lib/data/ai/datasource/ (Dio for HTTP, *_dio.dart suffix) and surface it through AIRepository / AIService in lib/services/ai/ai_service.dart.
3. Add the per-provider key plumbing in lib/data/_core/secure_storage.dart and the corresponding entry in ProvidersService (lib/services/providers/providers_service.dart).
4. Wire the connectivity test in lib/services/api_key_test/.
5. Add the row to the Settings → Providers UI under lib/features/providers/.

Adding a Drift table

1. Define the table class in lib/data/_core/app_database.dart.
2. Create a @DriftAccessor DAO class in the same file (include a deleteAll method so the table participates in SettingsService.wipeAllData).
3. Add both to the @DriftDatabase annotation and the daos list.
4. Increment schemaVersion and add a migration step.
5. Run dart run build_runner build --delete-conflicting-outputs.
6. If the table holds user data, add a wipe step to lib/services/settings/settings_service.dart so "Wipe all data" stays exhaustive.

Tech Stack

Layer	Technology
UI	Flutter · Material Design · Google Fonts
State	flutter_riverpod · riverpod_annotation
Navigation	go_router (ShellRoute)
Local DB	Drift (SQLite via sqlite3_flutter_libs)
Secret storage	flutter_secure_storage
HTTP / streaming	Dio (SSE via `ResponseType.stream`)
AI providers	OpenAI · Anthropic · Gemini · Ollama · Custom (OpenAI-compatible)
Tool sources	Built-in registry (filesystem, ripgrep, bash, web fetch) · MCP (stdio + HTTP/SSE)
Chat rendering	flutter_markdown_plus · flutter_highlight
GitHub auth	OAuth 2.0 Device Flow (RFC 8628) on a GitHub App — public `client_id` only
Self-update	GitHub Releases API · `codesign --verify` + `spctl --assess` · swap-and-relaunch helper
Preferences	shared_preferences (NSUserDefaults / equivalents)
Window management	window_manager
Serialization	freezed · json_annotation
Code generation	build_runner · riverpod_generator · drift_dev · freezed · json_serializable

Contributing

Contributions are welcome. Please read CONTRIBUTING.md before opening a PR.

Security

To report a vulnerability, see SECURITY.md.

License

MIT — free to use, modify, and distribute.
