Add AppKit support and databricks-agent-skills integration#356
Conversation
calreynolds
left a comment
There was a problem hiding this comment.
Looking for confirm on the comment! Otherwise skill content looks good 👍
| APX_RAW_URL="https://raw.githubusercontent.com/databricks-solutions/apx/main/skills/apx" | ||
|
|
||
| # Agent skills (fetched from databricks/databricks-agent-skills repo) | ||
| AGENT_SKILLS="databricks databricks-apps databricks-jobs:databricks-jobs-bundles databricks-lakebase databricks-pipelines" |
There was a problem hiding this comment.
can we only install databricks-apps for now? I think we need to go through process of reviewing jobs, lakebase, and pipelines still.
There was a problem hiding this comment.
For good codegen, we need databricks to manage the CLI & do basic data discovery and datbricks-lakebase to provision lakebase if required.
so I could leave out jobs and pipelines -- but I think that it would be better to fully unify
There was a problem hiding this comment.
@jamesbroadhead does appkit rely on those other skills? or is it standalone? for now I think it would be great to scope down to just appkit if it can work standalone since we have functional lakebase skills + tools that will likely work with the new app stuff
There was a problem hiding this comment.
skills: yes
but I'll drop the other two - we can always add them at some point later
There was a problem hiding this comment.
So is the intention that this will try pointing people to use the other skill databricks-apps which comes from databricks-agent-skills repo? I'm ok with that as long as in testing we see it will proceed to use a Python framework if user does not want a Javascript backend.
There was a problem hiding this comment.
Yes - and sounds good on testing. What kind of testing did you have in mind?
There was a problem hiding this comment.
@jamesbroadhead for this PR this can be an ad-hoc test that we see the results of! We also have a custom .test framework in this repo if you want to attach Claude to to run 👍
There was a problem hiding this comment.
(Claude posting on James's behalf — James reviewed the test plan and the cleanup before I posted.)
Status: Merged origin/main into the branch (one install.sh conflict resolved — kept main's commit a467a14 fix that stops --skills auto-including CORE_SKILLS, layered this PR's AGENT_SKILLS bucket on top). Also pushed a cleanup commit that completes the pre-existing-bug fix called out in the PR description: renamed every remaining databricks-app-python (singular) reference to databricks-apps-python — .test/skills/, .test/baselines/, _routing/ground_truth.yaml (12 entries), routing.py, test_scorers.py (16 asserts), run_app_eval.py docs, builder-app system_prompt.py / skills_manager.py / DocPage.tsx, MCP apps.py docstring, install_skills.sh, and related-skill links in databricks-bundles + databricks-lakebase-provisioned. .test/tests/test_scorers.py is 36/36 green after the rename.
Tests run (ad-hoc, per Cal's suggestion):
Test A — Dustin's specific concern. Prompt: "Build a Databricks app showing customer orders from SQL warehouse with a filter. Python-only team, no JS/TS."
PASS. Sub-agent picked Streamlit, produced correct app.py + app.yaml (Config(), @st.cache_resource, command: ["streamlit", "run", "app.py"]). It cited line 57 ("Use Python when: the team is Python-only…") and line 132 ("Fall back to Python Framework Selection only if Python is required") as the deciding lines. Zero AppKit pushback once "Python-only" was stated.
Test B — sanity check, explicit Streamlit ask without "no JS" framing. Prompt: "I need to build a quick Streamlit dashboard for our ML model's predictions, with a SQL warehouse backend."
Works, with friction. Streamlit is reachable but a conscientious agent feels obligated to ask "are you sure you don't want AppKit?" first. Three friction points the sub-agent flagged:
- Line 55: Python labeled "alternative" — small but noticeable sting when the user already named Streamlit.
- Line 132: Workflow step 1 routes "new app from scratch" → AppKit with no branch for "user already named a Python framework".
- Lines 71–77: Required Steps checklist starts at "Framework selected" even when the user named one.
Proposed one-liner for Workflow step 1: "If the user names a specific Python framework, skip the AppKit recommendation and proceed." Happy to add to this PR if you want, or punt to a follow-up.
.test framework run: Got past SKILL.md loading on the renamed skill; preflight LLM call fails on my token (auth, not framework). The unit test sweep (pytest .test/tests/test_scorers.py) is the meaningful check, and that's 36/36.
Remove databricks-jobs:databricks-jobs-bundles and databricks-pipelines from agent skills in both install.sh and install.ps1. Co-authored-by: Isaac
# Conflicts: # install.sh
Completes the pre-existing-bug fix called out in the PR description. The skill directory `databricks-apps-python` (plural) is the canonical name; this updates all remaining stale singular references (.test eval skill, routing ground truth, scorer, baselines, run_app_eval docs, builder-app system prompt + UI, MCP tool docstring, install_skills.sh, related-skill links in lakebase-provisioned/bundles SKILL.md). Verified .test/tests/test_scorers.py: 36/36 pass. Co-authored-by: Isaac
Two issues caught in parallel review by GPT 5.4 and Gemini 3.1 Pro: 1. **Upstream skill name mismatch** (HIGH, GPT 5.4): The agent skill at `databricks/databricks-agent-skills` is named `databricks-core`, not `databricks`. Previously `--skills-profile app-developer` would silently fail to install one of three advertised skills. Fix: use the source:install-name syntax this PR already introduces — `databricks-core:databricks` keeps the local install name while pointing at the correct upstream directory. 2. **Partial-download false success** (CRITICAL, Gemini 3.1 Pro): `ok_flag` was initialized to 0 and flipped to 1 on the first successful file download, never reset on subsequent failures. Any single successful file (even an examples/*.py) marked the skill as installed, leaving a broken skill on disk. Same pattern in install.ps1 with $okFlag. Fix: initialize the flag to 1 (true) and only flip to 0 (false) on any single download failure — a skill installs cleanly iff every file lands. Not addressed (acceptable today, worth a follow-up): - Unauthenticated GitHub API rate limit (60/hr) — fine for single- user installs, can bite CI. - `grep '\.'` heuristic to filter trees vs blobs — brittle but works for current skills (all files have extensions). - Sequential vs parallel curl — perf nit, not a bug. Co-authored-by: Isaac
|
(not sure I'm following up with this PR - hold off on review for the moment) |
|
^ ok, will pursue merging this - seeking re-review please |
GPT 5.4 xhigh and Gemini 3.1 Pro both flagged the new short agent-skill names triggering latent bugs in the install scripts. Three fixes: 1. install.sh `_is_preselected`: `grep -qw` treats `-` as a word boundary, so checking for `databricks` would falsely match `databricks-jobs`, `databricks-apps`, etc. Strip `source:` prefix from each preselected entry and use `grep -Fxq` for exact whole-line equality. Same fix applied to the deselection cleanup at the cleanup loop. 2. install.ps1: `$preselected -contains "databricks"` is exact equality on array elements, so it never matched the `"databricks-core:databricks"` entry seeded by the app-developer profile — the Agent: Databricks checkbox was never auto-preselected. Normalize `$preselected` by stripping `source:` prefixes once before the menu is built. 3. install.sh/ps1: the "Agent skills (N) -> ..." success line ran unconditionally, even when the GitHub tree fetch failed or every per-skill download warned and was rolled back. Track an `agent_success`/`$agentSuccess` counter and only print the success line when all selected agent skills installed; print a warning when only some succeeded. Co-authored-by: Isaac
…ale dirs Two more ACE review findings, both real: 1. The path-extraction regex `"path":"skills/..."` (no space) does NOT match the GitHub tree API's actual response, which is pretty-printed as `"path": "skills/..."` (space after colon). Result: the installer would warn "Could not fetch agent skill" for every entry and install nothing. Fix: collapse the JSON whitespace before regex extraction, and switch from the `grep '\.'` heuristic to matching the adjacent `"type": "blob"` field so directory entries are correctly skipped. 2. Reinstalls reused the existing $dest_dir / $destDir without clearing it, so files removed upstream would persist locally across upgrades. Fix: `rm -rf` / `Remove-Item -Recurse` the destination before each skill is downloaded. Co-authored-by: Isaac
ACE iteration 2 findings: - Gemini caught that the `grep -qw` word-boundary bug I fixed in `_is_preselected` still bit the resolve_skills() bucketing chain (lines 918/920/922). Passing `--skills databricks` (a valid agent install-name) would match `databricks-app-apx` via the hyphen word-boundary on the APX branch and get misclassified. Switched all three buckets to exact match via `tr ' ' '\n' | grep -Fxq`, and rewrote the source:install-name lookup on line 923 with `-E` and a clean anchored alternation. - GPT caught that when the GitHub tree fetch fails entirely (`agent_tree` empty / `$agentTree` null), the summary branch emits nothing — neither OK nor "only N of M installed" — even though zero of N agent skills were actually installed. Added an `else` branch that emits a `0 of $agent_count installed` warning. Out of scope (flagged but not addressed): - GPT raised legacy-name compatibility for `databricks-app-python` → `databricks-apps-python`. The rename is intentional per the product decision; back-compat aliasing is a separate concern. Co-authored-by: Isaac
…tal) (#533) * Add AppKit support and databricks-agent-skills integration Equivalent of #356 (which targets main), adapted for the experimental branch: - Rename `databricks-app-python` → `databricks-apps-python` (plural) for the bundled skill directory, baselines, manifests, builder-app refs, install scripts, and cross-skill mentions. - `databricks-apps-python/SKILL.md` now leads with AppKit (TypeScript + React SDK) as the recommended approach for new apps, with Python frameworks (Dash, Streamlit, Gradio, Flask, FastAPI, Reflex) demoted to an explicit alternative. Frontmatter `name` and H1 title updated to match. - `install.sh` and `install.ps1` fetch and install skills from `databricks/databricks-agent-skills` (`databricks`, `databricks-apps`, `databricks-lakebase`) via a single GitHub API tree call. New `AGENT_SKILLS` variable supports `source:install-name` syntax (e.g. `databricks-core:databricks`) so the install directory can differ from the upstream skill path. - Preserves experimental's polish: keeps `6-cli-approach.md` (not MCP) and references `databricks-lakebase-autoscale` (since `databricks-lakebase-provisioned` was removed on experimental). Co-authored-by: Isaac * fix: address ACE multi-model review findings GPT 5.4 xhigh and Gemini 3.1 Pro both flagged the new short agent-skill names triggering latent bugs in the install scripts. Three fixes: 1. install.sh `_is_preselected`: `grep -qw` treats `-` as a word boundary, so checking for `databricks` would falsely match `databricks-jobs`, `databricks-apps`, etc. Strip `source:` prefix from each preselected entry and use `grep -Fxq` for exact whole-line equality. Same fix applied to the deselection cleanup at the cleanup loop. 2. install.ps1: `$preselected -contains "databricks"` is exact equality on array elements, so it never matched the `"databricks-core:databricks"` entry seeded by the app-developer profile — the Agent: Databricks checkbox was never auto-preselected. Normalize `$preselected` by stripping `source:` prefixes once before the menu is built. 3. install.sh/ps1: the "Agent skills (N) -> ..." success line ran unconditionally, even when the GitHub tree fetch failed or every per-skill download warned and was rolled back. Track an `agent_success`/`$agentSuccess` counter and only print the success line when all selected agent skills installed; print a warning when only some succeeded. Co-authored-by: Isaac * fix: agent-skills installer regex matches live GitHub API + clears stale dirs Two more ACE review findings, both real: 1. The path-extraction regex `"path":"skills/..."` (no space) does NOT match the GitHub tree API's actual response, which is pretty-printed as `"path": "skills/..."` (space after colon). Result: the installer would warn "Could not fetch agent skill" for every entry and install nothing. Fix: collapse the JSON whitespace before regex extraction, and switch from the `grep '\.'` heuristic to matching the adjacent `"type": "blob"` field so directory entries are correctly skipped. 2. Reinstalls reused the existing $dest_dir / $destDir without clearing it, so files removed upstream would persist locally across upgrades. Fix: `rm -rf` / `Remove-Item -Recurse` the destination before each skill is downloaded. Co-authored-by: Isaac * fix: resolve_skills bucketing + 0-of-N agent install summary ACE iteration 2 findings: - Gemini caught that the `grep -qw` word-boundary bug I fixed in `_is_preselected` still bit the resolve_skills() bucketing chain (lines 918/920/922). Passing `--skills databricks` (a valid agent install-name) would match `databricks-app-apx` via the hyphen word-boundary on the APX branch and get misclassified. Switched all three buckets to exact match via `tr ' ' '\n' | grep -Fxq`, and rewrote the source:install-name lookup on line 923 with `-E` and a clean anchored alternation. - GPT caught that when the GitHub tree fetch fails entirely (`agent_tree` empty / `$agentTree` null), the summary branch emits nothing — neither OK nor "only N of M installed" — even though zero of N agent skills were actually installed. Added an `else` branch that emits a `0 of $agent_count installed` warning. Out of scope (flagged but not addressed): - GPT raised legacy-name compatibility for `databricks-app-python` → `databricks-apps-python`. The rename is intentional per the product decision; back-compat aliasing is a separate concern. Co-authored-by: Isaac
Summary
databricks-apps-python/SKILL.md(formerlydatabricks-apps/) now leads with AppKit (TypeScript + React SDK) as the recommended approach for new apps, with Python frameworks (Dash, Streamlit, etc.) demoted to an explicit alternative. Frontmatternameand H1 title updated to match.databricks-skills/databricks-apps→databricks-skills/databricks-apps-pythonto avoid collision with the incoming agent skill of the same name.databricks/databricks-agent-skills: Bothinstall.shandinstall.ps1now fetch and install skills from databricks/databricks-agent-skills (databricks,databricks-apps,databricks-jobs→ installed asdatabricks-jobs-bundles,databricks-lakebase,databricks-pipelines). Uses a single GitHub API tree call per install to fetch all files recursively, including nestedreferences/subdirectories.AGENT_SKILLSsupportssource:install-nameentries (e.g.databricks-jobs:databricks-jobs-bundles) to decouple the upstream repo path from the local install directory name — no changes needed in the upstream repo.databricks-app-pythonwas listed inSKILLS/profiles but the directory was alwaysdatabricks-apps— now consistent asdatabricks-apps-python.Test plan
bash install.sh --list-skillsshows agent skills with install names (notsource:install-nameraw entries)bash install.sh --skills-profile app-developerinstallsdatabricks-apps-python,databricks-app-apx, and agent skillsdatabricks,databricks-apps,databricks-lakebase,databricks-pipelinesbash install.sh --skills databricks-jobs-bundlescorrectly fetches and installs fromskills/databricks-jobs/in the agent-skills repo into a directory nameddatabricks-jobs-bundlesbash install.sh --skills-profile allinstalls all 39 skills with no directory conflictsinstall.ps1on WindowsThis pull request was AI-assisted by Isaac.