mkappworks-dev/code-bench-app

Code Bench

Desktop AI coding assistant for local repositories. Bring your own model — Anthropic, OpenAI, Gemini, and Ollama work out of the box, or point it at any OpenAI-compatible custom endpoint. Chat over your repo, run your tools, edit files in place, and watch git state update inline.

License: MIT · Platform: macOS · Platform: Windows/Linux

Installation (macOS)

  1. Download CodeBench-macos.dmg from the latest release.
  2. Open the DMG and drag Code Bench into Applications.
  3. Launch the app from Applications.

When you point Code Bench at a project under Documents, Downloads, or Desktop, macOS will ask "Code Bench would like to access files in your … folder." Click Allow. Code Bench reads project files from wherever you store them on disk, so it needs access to those user folders.

Features

The app is chat-centric: a single conversation surface with a project sidebar on the left, an inline changes panel that surfaces the agent's edits as they happen, and a top action bar for the active project's branch and PR state. Settings (⌘,) hosts the configuration sub-areas.

| Surface | Capabilities |
| --- | --- |
| Chat | Streaming responses · per-session system prompt · model selector · agent loop with tool use · interactive permission prompts · "ask the user a question" cards |
| Project sidebar | Add/relocate local projects · session list per project · session archive · live git state (branch, ahead/behind, dirty-tree) · branch picker |
| Changes panel | Inline diff per edited file · per-change accept/reject · conflict-merge view when on-disk drifts from agent edits · commit dialog · create-PR dialog |
| Coding tools | Built-in tool registry (filesystem read/write, ripgrep, bash, web fetch) · per-tool denylist · ripgrep auto-detect · ready for MCP-server tool sources |
| MCP servers | Configure stdio and HTTP/SSE MCP servers · enable/disable per server · tool inventory surfaces in chat |
| Integrations | GitHub sign-in (Device Flow) · repository browser feeding the project sidebar |
| Providers | Multi-provider key storage (OpenAI · Anthropic · Gemini · Ollama · custom OpenAI-compatible endpoint) · per-provider connectivity test · keys in OS keychain |
| Settings → Reset | "Wipe all data": clears API keys, GitHub sign-in, chat history, projects, and MCP servers in one step |
| Auto-update | Checks the GitHub Releases endpoint on launch and from Settings · verifies Team-ID match and codesign/spctl on macOS · self-installs and relaunches |

Platforms

| Platform | Status |
| --- | --- |
| macOS | ✅ Supported: built, signed, notarized, and released in CI on every tag |
| Windows | ⚠️ Unsupported: Flutter build target exists, but no CI, no signing, no released binaries. Use at your own risk and expect to fix things. |
| Linux | ⚠️ Unsupported: same caveats as Windows. Both matrix entries are commented out in .github/workflows/build.yml and .github/workflows/build-and-publish.yml. |

iOS, Android, and Web are out of scope.

If you want Windows or Linux to be a supported platform, the path is: re-enable the matrix entries in build.yml and build-and-publish.yml, add platform-appropriate code-signing, fix anything that breaks, and update this section. Until that happens, treat the desktop builds for those targets as a developer-only escape hatch.

Requirements

| Dependency | Version |
| --- | --- |
| Flutter SDK | ≥ 3.41.6 stable |
| Dart SDK | ≥ 3.11.4 |
| Xcode (macOS builds) | 15+ Sequoia (Xcode CLI tools required) |
| Windows 10 (Windows builds) | 1903+ (in development) |
| GTK 3 + ninja + cmake (Linux builds) | system packages (in development) |

1. Clone and fetch packages

git clone git@github.com:mkappworks-dev/code-bench-app.git
cd code-bench-app
flutter pub get

2. Generate code

Drift (SQLite ORM) and Riverpod require a one-time code-generation step. Run this before the first build and whenever you modify database tables or add @riverpod providers:

dart run build_runner build --delete-conflicting-outputs

Use watch mode during active development:

dart run build_runner watch --delete-conflicting-outputs

3. Run

For macOS:

flutter run -d macos      # primary dev target

For Windows:

flutter run -d windows

For Linux:

flutter run -d linux

On first launch, the onboarding screen gates access until at least one AI provider API key is saved.

GitHub sign-in — Code Bench uses the OAuth 2.0 Device Authorization Grant (RFC 8628) on the Benchlabs Codebench GitHub App. The app's client_id is checked into source at ApiConstants.githubClientId and shipped in the binary — that's intentional. Device Flow treats client_id as a non-secret (the same reason there is no client_secret to ship), so embedding it carries no credential-leak risk; see docs/superpowers/specs/2026-05-03-github-app-device-flow-design.md for the full threat model.

Forks must register their own GitHub App and replace ApiConstants.githubClientId. To register: github.com → Settings → Developer settings → GitHub Apps → New GitHub App, tick Enable Device Flow at the bottom, and copy the resulting Iv23li… Client ID over the embedded value.
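
As a rough illustration of the Device Flow sign-in described above, the sketch below interprets a single response from the token endpoint while the app polls for the user's approval. The error codes (authorization_pending, slow_down, access_denied, expired_token) and the 5-second back-off come from RFC 8628 section 3.5. Every type and function name here (PollStep, Authorized, KeepPolling, Stopped, nextPollStep) is hypothetical and not taken from Code Bench's source.

```dart
// Sketch only: interprets one token-poll response from the Device Flow.
// All type and function names here are hypothetical, not Code Bench's API.

/// Outcome of a single poll of the token endpoint.
sealed class PollStep {
  const PollStep();
}

/// The user approved the request; [accessToken] can be stored.
final class Authorized extends PollStep {
  const Authorized(this.accessToken);
  final String accessToken;
}

/// Keep polling, waiting [intervalSeconds] before the next request.
final class KeepPolling extends PollStep {
  const KeepPolling(this.intervalSeconds);
  final int intervalSeconds;
}

/// Polling must stop; [reason] is the error code from the server.
final class Stopped extends PollStep {
  const Stopped(this.reason);
  final String reason;
}

/// Maps a decoded JSON [body] to the next step, per RFC 8628 section 3.5.
PollStep nextPollStep(Map<String, Object?> body, int intervalSeconds) {
  final token = body['access_token'];
  if (token is String && token.isNotEmpty) return Authorized(token);

  final error = body['error'];
  switch (error) {
    case 'authorization_pending':
      return KeepPolling(intervalSeconds);
    case 'slow_down':
      // The RFC requires adding 5 seconds to the interval on slow_down.
      return KeepPolling(intervalSeconds + 5);
    default:
      // access_denied, expired_token, or anything unexpected ends the flow.
      return Stopped(error is String ? error : 'unknown_error');
  }
}
```

Keeping this decision in a pure function, separate from the HTTP call, would let the polling loop be tested without a network; whether the app is structured this way is an assumption.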

Project Structure

lib/
├── main.dart                    # Entry point — ProviderScope, window_manager init
├── app.dart                     # MaterialApp.router wired to GoRouter
├── router/
│   └── app_router.dart          # GoRouter: onboarding guard + chat ShellRoute + settings route
├── shell/
│   ├── chat_shell.dart          # Sidebar + chat column + optional changes panel; ⌘N / ⌘, shortcuts
│   ├── notifiers/               # Top-action-bar and status-bar state
│   └── widgets/                 # AppLifecycleObserver, TopActionBar, StatusBar, ActionOutputPanel
├── core/                        # Constants, AppException hierarchy, theme/colors, utils, shared widgets
├── data/
│   ├── _core/                   # Drift AppDatabase, DioFactory, SecureStorage, preferences
│   ├── shared/                  # Cross-cutting models: AIModel, ChatMessage
│   ├── ai/                      # AI datasources (Dio), repository, models/
│   ├── session/                 # Session datasource (Drift), repository, models/ (ChatSession, ToolEvent, …)
│   ├── project/                 # Project datasource (Drift), repository, models/ (Project, WorkspaceProject, …)
│   ├── git/                     # Git datasource (Process), live-state datasource, repository, models/, exceptions
│   ├── github/                  # GitHub datasources (Dio + OAuth), repository, models/
│   ├── apply/                   # Apply datasource (filesystem), repository, security guard
│   ├── filesystem/              # Filesystem datasource (dart:io)
│   ├── bash/                    # Bash datasource (Process) — the one documented `runInShell` exception
│   ├── coding_tools/            # Tool inputs/outputs, denylist, registry-facing types
│   ├── mcp/                     # MCP config datasource (Drift), transport datasources (stdio + HTTP/SSE), repository, models/
│   ├── web_fetch/               # Web-fetch datasource (Dio)
│   ├── providers/               # Provider catalog + ProvidersService backing
│   ├── settings/                # Settings datasource (Drift + SharedPreferences), repository, models/
│   ├── update/                  # Update datasources (Dio for releases, Process for install, IO for sentinel), models/
│   └── integrations/            # Integration metadata (GitHub OAuth)
├── services/
│   ├── ai/                      # AIService — stream buffering, model resolution
│   ├── agent/                   # Agent loop — tool dispatch, permission prompts, iteration cap
│   ├── coding_tools/            # ToolRegistry, denylist service, ripgrep availability probe, individual tools/
│   ├── mcp/                     # MCP service — server lifecycle, tool inventory
│   ├── git/                     # GitService — composite git operations
│   ├── github/                  # GitHubService — OAuth + REST composition
│   ├── session/                 # SessionService — send-and-stream, history, archive
│   ├── project/                 # ProjectService — add/relocate, scan
│   ├── apply/                   # ApplyService — patch orchestration + security guard
│   ├── providers/               # ProvidersService — keychain-backed key storage
│   ├── api_key_test/            # ApiKeyTestService — provider connectivity checks
│   ├── ide/                     # IdeService — editor/terminal launch
│   ├── settings/                # SettingsService — wipe cascade, onboarding
│   └── update/                  # UpdateService — version comparison, codesign/spctl gates, swap-and-relaunch
└── features/
    ├── onboarding/              # First-run wizard (API keys, GitHub sign-in)
    ├── chat/                    # Chat UI, message streaming, agent permission prompts, code-apply actions
    ├── project_sidebar/         # Project list, session list, archive, branch picker triggers
    ├── branch_picker/           # Branch picker dialog + notifier
    ├── archive/                 # Archived sessions screen
    ├── general/                 # Settings → General (preferences, update section, reset section)
    ├── providers/               # Settings → Providers (per-provider keys + test)
    ├── integrations/            # Settings → Integrations (GitHub sign-in)
    ├── coding_tools/            # Settings → Coding Tools (denylist, ripgrep status)
    ├── mcp_servers/             # Settings → MCP Servers (configure, enable/disable)
    ├── update/                  # Update notifier, state, failure types, "Check now" UI
    └── settings/                # Settings shell + sub-area router

Architecture

Dependency rule

The dependency graph is strictly one-directional. Violating it is a build-review blocker:

Widgets / Screens
      ↓  (ref.watch / ref.read notifier)
  Notifiers          ← the only layer widgets may reach
      ↓  (ref.read service)
  Services           ← business logic, composition, typed exceptions
      ↓  (constructor injection)
  Repositories       ← domain interfaces; no I/O
      ↓
  Datasources        ← Dio, DB, Process.run, filesystem live here
      ↓
External (REST APIs / SQLite / OS)

Widgets communicate with notifiers only via ref.watch / ref.read(…notifier).method(). They never reach into a service or repository provider directly. Process.run, dart:io, and Dio are confined to lib/data/**/datasource/.
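
The layering above can be sketched in plain Dart, leaving out Riverpod wiring for brevity. This is a simplified illustration, not the project's code: GitProcessDatasource, GitRepositoryImpl, FakeGitRepository, GitBranchException, and the currentBranch method are all hypothetical names, and the real classes may look quite different.

```dart
import 'dart:io';

// Sketch only: every name below is hypothetical and simplified.

/// Datasource: the only layer that touches Process.run.
class GitProcessDatasource {
  Future<ProcessResult> revParseHead(String repoPath) => Process.run(
        'git',
        ['rev-parse', '--abbrev-ref', 'HEAD'],
        workingDirectory: repoPath,
      );
}

/// Repository: a domain interface with no I/O of its own.
abstract interface class GitRepository {
  Future<String> currentBranch(String repoPath);
}

/// Repository implementation: delegates I/O to the datasource.
class GitRepositoryImpl implements GitRepository {
  GitRepositoryImpl(this._datasource);
  final GitProcessDatasource _datasource;

  @override
  Future<String> currentBranch(String repoPath) async {
    final result = await _datasource.revParseHead(repoPath);
    if (result.exitCode != 0) {
      throw ProcessException(
          'git', ['rev-parse'], '${result.stderr}', result.exitCode);
    }
    return (result.stdout as String).trim();
  }
}

/// In-memory fake: the service only sees the interface, so tests can
/// swap this in without spawning a process.
class FakeGitRepository implements GitRepository {
  FakeGitRepository(this.branch);
  final String branch;

  @override
  Future<String> currentBranch(String repoPath) async => branch;
}

/// Typed domain exception surfaced by the service layer.
class GitBranchException implements Exception {
  GitBranchException(this.message);
  final String message;
}

/// Service: receives the repository by constructor injection and
/// converts low-level errors into a typed domain exception.
class GitService {
  GitService(this._repository);
  final GitRepository _repository;

  Future<String> currentBranch(String repoPath) async {
    try {
      return await _repository.currentBranch(repoPath);
    } on ProcessException catch (e) {
      throw GitBranchException('Could not read branch: ${e.message}');
    }
  }
}
```

In this sketch a notifier would hold a GitService and never import dart:io itself, which is the property the dependency rule is meant to protect.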

Command notifiers (*Actions, e.g. ProjectSidebarActions, CodeApplyActions, GitActions) use void build() with keepAlive: true and expose imperative Future<void> methods. They are the bridge between the UI and the service layer.

Naming conventions:

| Layer | Rule |
| --- | --- |
| Service | class ends in Service (GitService, SessionService) |
| Service provider | @riverpod function placed before the class it instantiates |
| Repository interface | ends in Repository (GitRepository, AIRepository) |
| Repository impl + provider | class ends in RepositoryImpl; @riverpod before it |
| Datasource file naming | suffix encodes I/O type: *_dio.dart, *_process.dart, *_io.dart, *_drift.dart |
| Command notifier | ends in Actions; void build(), keepAlive: true |
| State notifier | ends in Notifier; owns AsyncValue or value state |
| Notifier file placement | *_notifier.dart, *_actions.dart, and *_failure.dart all live in {feature}/notifiers/ |

The Riverpod generator strips the Notifier suffix from provider names (ActiveSessionIdNotifier → activeSessionIdProvider). The Actions suffix is kept (GitActions → gitActionsProvider). Widgets must never call ref.invalidate directly; route through a notifier method instead.
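
To make the naming rule concrete, here is a small function that reproduces it for the two examples given. It mirrors only the rule as stated in this section; it is not the generator's implementation, and providerNameFor is a hypothetical helper.

```dart
// Sketch only: mirrors the naming rule stated above. This is not the
// generator's code, and providerNameFor is a hypothetical helper.
String providerNameFor(String className) {
  if (className.isEmpty) {
    throw ArgumentError.value(className, 'className', 'must not be empty');
  }
  const suffix = 'Notifier';
  // Strip a trailing "Notifier", but never reduce the name to nothing.
  final base = className.endsWith(suffix) && className.length > suffix.length
      ? className.substring(0, className.length - suffix.length)
      : className;
  // Lower-case the first letter and append "Provider".
  return '${base[0].toLowerCase()}${base.substring(1)}Provider';
}
```

Called with 'ActiveSessionIdNotifier' it yields activeSessionIdProvider, and with 'GitActions' it yields gitActionsProvider, matching the examples in the text.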

Layered architecture

Widgets are pure state-renderers. They call notifier methods and listen for AsyncError state to show snackbars — they never try/catch business-logic calls or import service/repository exception types.

Notifiers mediate all commands. *Actions notifiers extend AsyncNotifier<void>; failures are emitted as AsyncError carrying a typed sealed class {Notifier}Failure. *Notifier classes own reactive AsyncValue<T> data state.
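
A typed sealed failure class of the kind described above might look like the following. The failure subtypes and the snackbarTextFor helper are invented for illustration; the actual {Notifier}Failure classes in the repository are not shown in this document.

```dart
// Sketch only: a hypothetical failure type for a hypothetical GitActions
// notifier. None of these subtypes are taken from the repository.
sealed class GitActionsFailure {
  const GitActionsFailure();
}

final class GitNotInstalled extends GitActionsFailure {
  const GitNotInstalled();
}

final class PushRejected extends GitActionsFailure {
  const PushRejected(this.remote);
  final String remote;
}

final class UnknownGitFailure extends GitActionsFailure {
  const UnknownGitFailure(this.detail);
  final String detail;
}

/// Text a widget might show in a snackbar. The switch is exhaustive, so
/// adding a new failure subtype is a compile error until it is handled.
String snackbarTextFor(GitActionsFailure failure) => switch (failure) {
      GitNotInstalled() => 'Git was not found on this machine.',
      PushRejected(:final remote) => 'Push to $remote was rejected.',
      UnknownGitFailure(:final detail) => 'Git failed: $detail',
    };
```

Because the widget matches on the failure type rather than on a service exception, it never needs to import anything from the service or repository layers.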

Services own business logic and composition. They receive repositories via constructor injection, convert low-level I/O errors into typed domain exceptions, and expose a clean API to notifiers. Services are instantiated via @riverpod / @Riverpod(keepAlive: true) providers and never constructed directly.

Repositories are domain interfaces (lib/data/**/repository/). Implementations (*RepositoryImpl) are wired up via Riverpod providers and injected into services.

Datasources (lib/data/**/datasource/) are where all I/O lives: Dio HTTP calls, SQLite via Drift, Process.run, and dart:io filesystem access. File suffix encodes the I/O type: *_dio.dart, *_process.dart, *_io.dart, *_drift.dart.

The full rules — naming conventions, error-handling patterns, logging matrix, security guards — are in CLAUDE.md.

State management

| Pattern | Used for |
| --- | --- |
| @Riverpod(keepAlive: true) class Notifier | Long-lived app state: active session ID, active project ID, selected model, system prompts, DB, storage |
| @Riverpod(keepAlive: true) class Actions | Imperative commands: *Actions notifiers expose Future<void> methods that mediate widget → service calls (e.g. CodeApplyActions, ProjectSidebarActions, GitActions) |
| @riverpod class AsyncNotifier | Chat messages (loads history, streams new messages) |
| @riverpod function (StreamProvider) | Session list, live git state, MCP server list (wraps Drift / Process stream sources) |
| @riverpod function (FutureProvider) | One-shot reads: available model list, package version, last update-check timestamp |

Local persistence

All data is stored in a local SQLite database managed by Drift (code_bench.db).

| Table | Stores |
| --- | --- |
| ChatSessions | Session ID · title · model/provider · created/updated timestamps · pin flag · archive flag |
| ChatMessages | Message ID · session FK · role · content · extracted code blocks (JSON) · tool events (JSON) · timestamp |
| WorkspaceProjects | Project ID · name · local path · linked repo ID · active branch · associated session IDs |
| McpServers | Server ID · name · transport (stdio / HTTP-SSE) · command + args · env (JSON) · URL · enabled flag |

DAOs: SessionDao (sessions + messages CRUD, stream watch) · ProjectDao (projects CRUD) · McpDao (servers CRUD, including deleteAll for the wipe cascade).

Secret storage

SecureStorageSource wraps flutter_secure_storage using a consistent key scheme:

| Key | Holds |
| --- | --- |
| api_key_{provider} | API key per AI provider (e.g. api_key_openai) |
| github_token | GitHub OAuth access token |
| ollama_base_url | Custom Ollama server URL |
| custom_endpoint_url | OpenAI-compatible custom endpoint |
| custom_endpoint_api_key | Key for the custom endpoint |

| Platform | Backend |
| --- | --- |
| macOS | Keychain (first_unlock accessibility) |
| Windows | Windows Credential Manager |
| Linux | libsecret |
Building for Distribution

For macOS:

flutter build macos --release   # → build/macos/Build/Products/Release/

macOS App Sandbox is intentionally disabled. Code Bench shells out to git, code, cursor, and user-defined action commands, which cannot work under sandbox. See macos/Runner/README.md for the rationale, contributor rules, and distribution implications (Mac App Store eligibility, hardened runtime, notarization).

For Windows:

flutter build windows --release # → build/windows/x64/runner/Release/ (unsupported)

For Linux:

flutter build linux --release   # → build/linux/x64/release/bundle/ (unsupported)

Releasing (macOS)

Releases are managed by release-please. Every merge to main updates an open release PR that bumps pubspec.yaml, writes CHANGELOG.md, and proposes the next semver version based on conventional commit types (feat: → minor, fix: → patch, feat!: / BREAKING CHANGE: → major). Merging that PR triggers .github/workflows/release-please.yml, which creates a draft GitHub release and chains .github/workflows/build-and-publish.yml to build the macOS app, sign with a Developer ID, notarize through Apple's notary service, staple the ticket, and upload CodeBench-macos.dmg and CodeBench-macos.zip to the draft. The chained workflow then publishes the draft (PATCH draft=false), which is the moment the v* git tag gets created and the release becomes "latest." The in-app auto-updater consumes those artifacts on next launch of older clients.
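
The commit-type mapping above (feat: → minor, fix: → patch, feat!: / BREAKING CHANGE: → major) can be sketched as a small function. This covers only the three mappings named in this section; release-please itself applies more rules than shown here, and bumpFor and releaseBump are hypothetical helpers.

```dart
enum Bump { none, patch, minor, major }

// Sketch only: covers just the three mappings named above. The real
// release-please tool has more rules; these helpers are hypothetical.

// Matches a Conventional Commits header: type, optional (scope), optional "!".
final _header = RegExp(r'^(\w+)(\([^)]*\))?(!)?:');

Bump bumpFor(String commitMessage) {
  if (commitMessage.contains('BREAKING CHANGE:')) return Bump.major;
  final match = _header.firstMatch(commitMessage);
  if (match == null) return Bump.none;
  if (match.group(3) != null) return Bump.major; // e.g. "feat!:"
  return switch (match.group(1)) {
    'feat' => Bump.minor,
    'fix' => Bump.patch,
    _ => Bump.none,
  };
}

/// The release takes the largest bump among the pending commits.
Bump releaseBump(Iterable<String> commitMessages) => commitMessages
    .map(bumpFor)
    .fold(Bump.none, (a, b) => a.index >= b.index ? a : b);
```

For example, a batch containing one fix: commit and one feat: commit resolves to a minor bump, since the largest change wins.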

Required GitHub Actions secrets

Add these under Settings → Secrets and variables → Actions before the first release:

| Secret | Holds | How to get it |
| --- | --- | --- |
| MACOS_CERTIFICATE | Base64-encoded Developer ID Application certificate (.p12) | Export from Keychain Access (right-click identity → Export → .p12), then base64 -i cert.p12 \| pbcopy and paste |
| MACOS_CERTIFICATE_PASSWORD | Password set when exporting the .p12 | The password you typed at export time |
| MACOS_PROVISIONING_PROFILE | Base64-encoded Developer ID provisioning profile | See macos/Runner/README.md for the full Developer Portal walkthrough |
| APPLE_ID | Apple ID email of the notarizing account | The email tied to your Apple Developer membership |
| APPLE_ID_PASSWORD | App-specific password, not your Apple ID password | appleid.apple.com → Sign-In and Security → App-Specific Passwords → Generate (label e.g. code-bench-notarize) |
| APPLE_TEAM_ID | 10-character Team ID (also used as Xcode DEVELOPMENT_TEAM) | developer.apple.com/account → Membership Details |
| RELEASE_PLEASE_TOKEN | Personal access token (classic) with repo scope | github.com/settings/tokens → Generate new token (classic) → check repo → no expiry |

Why a PAT for release-please? PRs created by the default GITHUB_TOKEN are blocked from triggering other workflows (GitHub's anti-loop protection). Without a PAT, the release PR's required status checks (Analyze & Test, Build (macos)) get stuck on "Expected — Waiting for status to be reported" and never run. A PAT makes the PR appear as user-created so CI fires normally.

Cutting a release

  1. Merge feature/fix PRs to main as normal — use Conventional Commits (feat:, fix:, etc.).
  2. release-please keeps a release PR open that accumulates all pending commits.
  3. When you're ready to ship, merge the release PR.
  4. CI tags, builds, notarizes, and publishes automatically — no manual steps needed.

Never manually bump pubspec.yaml or push v* tags. release-please owns both. Manual bumps or tags will confuse the manifest and produce duplicate or mis-versioned releases.

Manual recovery (workflow_dispatch)

The Release (build + publish) workflow has a Run workflow button under Actions tab → Release (build + publish). Use it for the recovery scenarios below — it's the safety valve when the chained automatic flow fails or you need to operate on an existing release out-of-band. Do not use it for normal releases; merging a Release Please PR is what cuts a release.

Inputs:

| Input | When to set it |
| --- | --- |
| tag | Always: the tag string, e.g. v0.2.0 |
| release_id | Only when publishing a stuck draft; otherwise blank |

Scenario A — stuck draft (no tag yet). release-please.yml ran but the chained run failed mid-way (build crashed, notarization timeout, upload failed). Result: a draft release exists in the Releases tab with a "Draft" badge, but no v* tag was created.

  1. Open the draft release; copy the release ID from the URL (releases/edit/<id>).
  2. Actions tab → Release (build + publish) → Run workflow.
  3. Enter the tag (e.g. v0.2.0) and the release ID → Run.

The workflow re-runs the build, deletes any half-uploaded assets, re-uploads, and flips draft=false to publish — which is what finally creates the git tag.

Scenario B — re-upload assets to a published release. The release shipped, but the DMG was discovered to be corrupted, or you want to attach an additional file. Tag exists, release is non-draft.

  1. Actions tab → Release (build + publish) → Run workflow.
  2. Enter the tag, leave release ID blank → Run.

The workflow checks out the tag, builds fresh, and the softprops upload step attaches the new artifacts, replacing same-named assets. The changelog is not overwritten.

Scenario C — orphaned tag with no release object. Rare. Someone ran git tag v0.2.0 && git push origin v0.2.0 outside the Release Please flow. Tag exists, no release object yet.

Same steps as Scenario B (tag only, blank release ID). softprops creates the release for that tag and uploads artifacts.

Scenario D — testing pipeline changes on an existing tag. You modified the build / sign / notarize steps in build-and-publish.yml and want to verify on a real tag without merging a Release Please PR.

Same steps as Scenario B. Be aware: this will replace existing assets on that release — pick a throwaway test tag if you don't want to disturb a real release.

Testing & Linting

flutter test                         # run all tests
flutter analyze                      # static analysis
dart format lib/ test/               # format
dart format --set-exit-if-changed lib/ test/   # CI format check

Extending Code Bench

Adding an AI provider

  1. Add a value to the AIProvider enum in lib/data/shared/ai_model.dart.
  2. Implement the streaming sendMessage path under lib/data/ai/datasource/ (Dio for HTTP, *_dio.dart suffix) and surface it through AIRepository / AIService in lib/services/ai/ai_service.dart.
  3. Add the per-provider key plumbing in lib/data/_core/secure_storage.dart and the corresponding entry in ProvidersService (lib/services/providers/providers_service.dart).
  4. Wire the connectivity test in lib/services/api_key_test/.
  5. Add the row to the Settings → Providers UI under lib/features/providers/.

Adding a Drift table

  1. Define the table class in lib/data/_core/app_database.dart.
  2. Create a @DriftAccessor DAO class in the same file (include a deleteAll method so the table participates in SettingsService.wipeAllData).
  3. Add both to the @DriftDatabase annotation and the daos list.
  4. Increment schemaVersion and add a migration step.
  5. Run dart run build_runner build --delete-conflicting-outputs.
  6. If the table holds user data, add a wipe step to lib/services/settings/settings_service.dart so "Wipe all data" stays exhaustive.

Tech Stack

| Layer | Technology |
| --- | --- |
| UI | Flutter · Material Design · Google Fonts |
| State | flutter_riverpod · riverpod_annotation |
| Navigation | go_router (ShellRoute) |
| Local DB | Drift (SQLite via sqlite3_flutter_libs) |
| Secret storage | flutter_secure_storage |
| HTTP / streaming | Dio (SSE via ResponseType.stream) |
| AI providers | OpenAI · Anthropic · Gemini · Ollama · Custom (OpenAI-compatible) |
| Tool sources | Built-in registry (filesystem, ripgrep, bash, web fetch) · MCP (stdio + HTTP/SSE) |
| Chat rendering | flutter_markdown_plus · flutter_highlight |
| GitHub auth | OAuth 2.0 Device Flow (RFC 8628) on a GitHub App (public client_id only) |
| Self-update | GitHub Releases API · codesign --verify + spctl --assess · swap-and-relaunch helper |
| Preferences | shared_preferences (NSUserDefaults / equivalents) |
| Window management | window_manager |
| Serialization | freezed · json_annotation |
| Code generation | build_runner · riverpod_generator · drift_dev · freezed · json_serializable |

Contributing

Contributions are welcome. Please read CONTRIBUTING.md before opening a PR.

Security

To report a vulnerability, see SECURITY.md.

License

MIT — free to use, modify, and distribute.
