release: develop → main (v0.9.0 — API-conventions overhaul + agent-API features)#755

Merged
chronoai-shining merged 232 commits into main from develop
May 29, 2026

@chronoai-shining
Collaborator

Minor release. ~97 changesets have accumulated on develop since v0.8.0; highest fixed-pair bump is minor, so both packages go 0.8.0 → 0.9.0 in fixed mode.

After merge, the changeset-release workflow will:

  1. Open release/v0.9.0 PR consuming the pending changesets and bumping both packages 0.8.0 → 0.9.0.
  2. On bump-PR merge: tag v0.9.0, publish a GitHub Release sourced from .github/release-notes-20260529.md, then open a sync PR back to develop (merge with Create a merge commit, never squash — per CLAUDE.md).

What ships

Breaking (5) — all pre-1.0 minor bumps per convention:

  • Error responses now RFC 7807 application/problem+json.
  • Error code values now lowercase_snake_case.
  • PATCH/DELETE /skills/:id/versions/:version accept the GUID only, not the name.
  • Deprecation signaled via RFC 8594 headers (replacing custom X-Skill-Deprecated).
  • Legacy ownerId field removed from skill responses.

New Features

  • Cursor pagination on /skill-search + SDK auto-pagination iterator.
  • Dist-tags for skill versions (latest / stable channels).
  • Idempotency-Key header for retry-safe state-changing requests.
  • Sliding-window rate limiting + RFC 9239 headers.
  • Published JSON Schema for SKILL.md frontmatter.
  • Subresource-Integrity hashes on version manifests.
  • Version-pinned install prompt; one-command docker-compose dev + example skills.

Changed

  • Canonical ?q= search param (legacy ?query= still works); 201 + Location on creating POSTs; skip-validation extends to frontmatter; chat composer length cap; backend zip-bomb caps; prod console silencing.

Fixed

  • Tab-return logout, CORS on save/login, package-edit 404, chat-completion (DeepSeek) providers, LLM-provider save, playground sandbox/session + friendly errors, quota races, inline create validation, actionable frontmatter/settings errors, notification sync/scroll, deactivated-service filtering, stale-share flagging, audit-rerun indicator, admin papercuts, private-skill JSON leak + playground env-value leak (security), Chinese localization sweep.

Issues closed

Test plan

  • Bot opens release/v0.9.0 PR — squash-merge it.
  • Tag v0.9.0 + Release publish; sync PR opens — merge with a merge commit (NOT squash) per CLAUDE.md.
  • Verify GitHub Release body uses .github/release-notes-20260529.md content (not the short fallback).

chore: sync main → develop after v0.8.0
#560) (#561)

Re-syncs SKILL.md + references/api-reference.md against the current
/api/v1/* surface on develop. No ornn-api / ornn-web code change — this
is a registry-published skill manual bump only. Agents pick up the new
contract by pulling GET /api/v1/skills/ornn-agent-manual-cli/json after
the registry-side GitHub sync runs.

SKILL.md
- frontmatter: version "1.1" -> "1.2", lastUpdated 2026-05-14
- replaced the now-expired "this file will be removed" banner with a
  positive Scope framing (CLI variant is the smaller-footprint companion
  to chrono-ai-service-manual)
- §0.5 versions response: added the missing { data: ... } envelope
- §1.3 permission table: removed defunct ornn:admin:category row, added
  ornn:quota:admin for the real quota-admin surface
- §2.13 notifications recipe: documented the discriminated
  source: "user" | "broadcast" feed shape that landed with broadcasts
- §2.15 (new): recipe for /me/quota + /me/models?surface=... so an
  agent can resolve a valid modelId and short-circuit on remaining: 0
  before any SSE call; mentions /me/redemption-codes/redeem as the
  top-up path
- §4 references: added pointers for /me/quota, /me/models, and
  /announcements/active

references/api-reference.md
- §1.5 permission catalogue: dropped ornn:admin:category, added
  ornn:quota:admin
- §1.7 HTTP status mapping: added 410 (gone) and 429 (QUOTA_EXCEEDED);
  fixed the 201 trigger examples to point at the real creating endpoints
  (announcement / broadcast / redemption-code create)
- §1.8 error legend: added INVALID_SURFACE, QUOTA_EXCEEDED, the three
  MODEL_* codes, NOTIFICATION_NOT_FOUND, and the four REDEMPTION_CODE_*
  codes with their real status codes
- §9.1 notifications: rewrote the response shape to show both variants
  of the discriminated union (user + bilingual broadcast)
- §9.3: renamed NOT_FOUND -> NOTIFICATION_NOT_FOUND and clarified that
  the handler routes by id-lookup across both collections
- §9.4: clarified that `updated` is a combined count across per-user
  notifications + broadcast read-receipts
- §11 me: added §11.9 /me/quota, §11.10 /me/models, §11.11
  /me/redemption-codes/redeem, §11.12 /me/redemption-codes/history;
  bumped the header endpoint count from 8 -> 12
- §13 admin: full rewrite. Removed six ghost sections (/admin/stats,
  /admin/activities, four /admin/categories/* routes, three
  /admin/tags/* routes). Replaced with compact tables for the real
  surface: /admin/dashboard/stats, /admin/users, admin skill browse
  + delete, AgentSeal rescan, /admin/quota/*, /admin/redemption-codes/*,
  /admin/mirror/{reconcile,status} with the correct 202/409/503 contract,
  announcements (public + admin), broadcasts
- §14 platform settings: full rewrite covering the four real route
  groups — legacy single-document /admin/settings, sectioned
  /admin/settings/{section} with the seven real publicPath values
  (playground, skill-generation, mirror, integrations/nyxid, skill-audit,
  posthog, extras), the seven LLM-provider routes including the
  per-model surface-flags PATCH, and export/import
- appendix: corrected the registry skill name from ornn-agent-manual
  to ornn-agent-manual-cli

.changeset/ornn-agent-manual-cli-v1-2.md
- empty changeset (no semver bump) per CLAUDE.md guidance for
  docs-only PRs; satisfies check-changeset.yml gate

Closes #560
Add a centered line — "Ornn official website — ornn.chrono-ai.fun" —
right under the tagline and above the section nav in the root README.

The README already linked to `ornn.chrono-ai.fun/docs` in two places,
but the bare homepage was nowhere to be found. First-time readers
landing on the GitHub repo had no one-click path to the product
surface; this puts the website at parity with the badges and the
docs link without restructuring the rest of the file.

Docs-only change. Empty changeset attached to satisfy the gate.
#468 — sets keywords aligned to agent-API positioning (ornn,
agent, skill-registry, skill-lifecycle, ai-agents,
model-agnostic, mcp, llm, sdk) and the four npm publish fields
that show up on the package page (homepage, bugs, repository,
license).

private:true is preserved here; npm publish strategy is the
separate scope of #473.
#468 — replaces the legacy keyword set (skills, ai, agents, nyxid)
with the same vocabulary used across the TS SDK so PyPI search
surfaces the same intent: ornn, agent, skill-registry,
skill-lifecycle, ai-agents, model-agnostic, mcp, llm, sdk.
Inserts a new `## SDK quickstart` section between `## How it works`
and the existing agent-side `## Quickstart`, with copy-paste TS +
Python snippets that match the real OrnnClient constructor (baseUrl
+ token) and .search({ q }) shape. Top nav gets a new anchor.

The npm install line is aspirational while #473 (TS SDK private:true,
publish strategy) is unresolved — that caveat is called out inline so
a reader hitting the npm 404 isn't surprised. The Python line works
because ornn-sdk is reserved on PyPI.

Closes #470.
chore(sdk): discovery keywords + npm publish fields (#468)
Inserts a "How Ornn compares" section after `## Quickstart` with
a feature matrix against MCP servers, Smithery, and npm registry,
followed by short prose on what each comparison actually means.

Owns the CLI gap honestly via a footnote pointing at the Phase 2
roadmap, instead of marking ✓ on a CLI we don't ship today. This
addresses the strategic-positioning callout in the issue: trim the
marketing to match what's actually built today, not what's planned.

Closes #472.
Specifies what currently has no public contract:

- Alpha caveat: 0.x line makes no compatibility guarantees;
  pinning exact versions is the only insulator during alpha.
- Post-v1 semver policy (patch / minor / major) and what each
  bump is allowed to change.
- Three stability tiers (stable, beta, experimental) declared
  in OpenAPI via x-stability. Beta + experimental require the
  Accept-Stability request header so SDK clients don't reach
  them by accident.
- RFC 8594 deprecation policy: ≥ 2 minor releases of lead time
  + ≥ 90 days for SDK-affecting deprecations, with synchronized
  signals on response headers, CHANGELOG, DEPRECATIONS.md, and
  GitHub release notes.
- Breaking-change PR checklist.

Adopting these tiers in OpenAPI + SDK is a follow-up; this doc
is the contract those follow-ups conform to.
README "Documentation" list adds a row for the new doc.

CONVENTIONS.md §7 (Deprecation) gets a one-line lede pointing
at API_STABILITY.md for the stability commitment + lead-time
policy — keeps CONVENTIONS focused on header shape and avoids
duplicating the same policy in two places.

Closes #474.
CONVENTIONS.md §1.6 / §7 reference these two files but neither
existed — every error response's RFC 7807 `type` URL was a 404,
and there was no public log for deprecation Link headers to anchor
to.

docs/ERRORS.md
  - One `##` heading per target lowercase_snake_case code from
    §1.4 so anchors resolve (`#validation_error`,
    `#authentication_required`, …).
  - Each section documents: HTTP status, current pre-#585 codes
    that map here, a sample response, and recommended client
    action.
  - Appendix maps every SCREAMING_SNAKE code emitted today by
    the server to its lowercase target — exhaustive at time of
    writing, owned by #585 (case migration) when it lands.

docs/DEPRECATIONS.md
  - Empty active section while in alpha (per API_STABILITY.md
    breaking changes go straight into the next minor, no
    deprecation cycle until v1).
  - Entry template + sample headers ready for the first
    deprecation.
…576)

Three URL renames in CONVENTIONS.md so the sample responses and
deprecation header in the doc match the actual on-disk filenames
(uppercase ERRORS.md / DEPRECATIONS.md, per #576):

- §1.3 sample error response `type` field
- §1.6 `type` URL template + matching catalog reference
- §7 sample `Link: rel="deprecation"`

Closes #576.
docs(readme): add SDK quickstart section (#470)
domains/skills/crud/repositories/ was a 218-line SkillRepository
implementation + its interface + its colocated test, none of which
were imported by any route, service, or test outside the directory.
Self-contained dead code from a refactor that never finished.

The live skill repository (915 lines) at
domains/skills/crud/repository.ts is untouched — every existing
import already points there.

Closes #577.
docs(readme): add positioning / comparison table (#472)
Number(process.env.X ?? "200") silently accepts garbage input:
non-numeric values and trailing letters become NaN, while
fractions, negatives, and the empty string (which becomes 0) pass
through as numbers. The migration then writes the result into the
quota_buckets documents. Quota math downstream uses these
defaults; once they're NaN, every comparison goes silently wrong.

Replaces the two call sites with parseNonNegativeInt(name, fallback)
that:
  - trims surrounding whitespace
  - rejects anything that doesn't match /^[0-9]+$/ with an error
    that names the env var and shows the offending value
  - returns a non-negative integer

Migration runs rarely and on prod, so failing loudly at startup
beats silently corrupting documents that nothing else validates.
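
A minimal sketch of such a helper, assuming the behaviour listed above. The commit only names `parseNonNegativeInt(name, fallback)`; the third `env` parameter and the exact error wording after "must be a non-negative integer" are assumptions added here so the example runs without `process.env`.

```typescript
// Hypothetical sketch, not the shipped helper. `env` is injected for testability.
type Env = Record<string, string | undefined>;

function parseNonNegativeInt(name: string, fallback: number, env: Env): number {
  const raw = env[name];
  // Unset variable: the fallback wins.
  if (raw === undefined) return fallback;

  const trimmed = raw.trim();
  // Digits only. Rejects "abc", "200abc", "-1", "1.5", and "".
  if (!/^[0-9]+$/.test(trimmed)) {
    throw new Error(
      `${name} must be a non-negative integer, got ${JSON.stringify(raw)}`,
    );
  }
  return Number.parseInt(trimmed, 10);
}
```

Because the regex runs before `Number.parseInt`, a value like `200abc` is rejected outright instead of being truncated to `200`.
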
Covers the failure modes Number(env) silently swallowed:
  - unset env → fallback wins
  - whitespace-padded valid value
  - explicit zero
  - non-numeric (abc)
  - trailing garbage (200abc) — the silent-truncation case
  - negatives, fractions, empty string

Each rejection path asserts the thrown error message contains
"must be a non-negative integer" so the helper's contract is
load-bearing in CI.
docs: add API_STABILITY.md commitment + deprecation policy (#474)
Pure module that validates the shape of a PEM-encoded GitHub App
private key before it lands in mirror settings. Checks performed
in order:

  - typeof string + non-empty
  - byte cap (8 KB — real RSA 4096 PEMs are ~3.2 KB)
  - C0 control bytes rejected (allow tab/CR/LF only)
  - CRLF normalised to LF + outer whitespace trimmed
  - BEGIN/END line markers (PKCS#1 or PKCS#8)
  - base64-only body
  - crypto.createPrivateKey round-trip — catches keys that pass
    every shape check but are corrupted, truncated, or for a
    different algorithm

Returns a discriminated { ok, value | reason } result instead of
throwing, so the calling route picks its own error code.
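
The shape checks above can be sketched as follows. This is an illustration under assumptions: `validatePemShape`, `PemResult`, and the reason strings are hypothetical names, the size cap is measured in string length rather than encoded bytes, and the final `crypto.createPrivateKey` round-trip is deliberately left out so the example has no runtime dependencies.

```typescript
// Hypothetical sketch of the shape checks only (no crypto round-trip).
type PemResult = { ok: true; value: string } | { ok: false; reason: string };

const MAX_PEM_LENGTH = 8 * 1024;

function validatePemShape(input: unknown): PemResult {
  if (typeof input !== "string" || input.length === 0) {
    return { ok: false, reason: "not a non-empty string" };
  }
  // String length is a lower bound on the byte length; valid PEMs are ASCII.
  if (input.length > MAX_PEM_LENGTH) {
    return { ok: false, reason: "exceeds 8 KB" };
  }
  // C0 control characters other than tab, LF, and CR.
  if (/[\x00-\x08\x0B\x0C\x0E-\x1F]/.test(input)) {
    return { ok: false, reason: "contains control characters" };
  }
  const pem = input.replace(/\r\n/g, "\n").trim();
  const lines = pem.split("\n");
  const first = lines[0];
  const last = lines[lines.length - 1];
  // PKCS#1 ("RSA PRIVATE KEY") or PKCS#8 ("PRIVATE KEY").
  const begin = /^-----BEGIN ((?:RSA )?PRIVATE KEY)-----$/.exec(first);
  if (begin === null) {
    return { ok: false, reason: "missing BEGIN marker" };
  }
  if (last !== `-----END ${begin[1]}-----`) {
    return { ok: false, reason: "missing END marker" };
  }
  const body = lines.slice(1, -1).join("");
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(body)) {
    return { ok: false, reason: "body is not base64" };
  }
  return { ok: true, value: pem };
}
```

A calling route can switch on `ok` and map `reason` to whatever error code it prefers, which is the point of returning a result instead of throwing.
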
12 tests cover the contract. Happy paths: PKCS#1 RSA-2048,
PKCS#8, CRLF normalisation, whitespace stripping. Rejections:
non-string, empty, oversize, embedded NUL, missing BEGIN,
missing END, non-base64 body, shape-passing-but-truncated.

Uses generateKeyPairSync so the test is hermetic; never reads
a fixture file or hits the network.
The README still meandered through nine H2 sections before a new
reader saw how to actually onboard. Restructure to a focused 5-section
front door:

1. **What is Ornn + Why we built it** — keep the agent-facing
   skill-lifecycle API framing, add a short "Why" that names the gap
   (no shared registry, model-locked alternatives, no lifecycle).
2. **How it works** — replace the ASCII art with a real Mermaid
   flowchart (GitHub renders `mermaid` code fences natively). Splits
   "Your machine" (agent + nyxid CLI + skill runtime) from
   "Ornn cloud" (ornn-api + NyxID + registry + sandbox) and shows the
   call flow in one glance.
3. **Quickstart** with three numbered, copy-paste-able sub-steps:
   - Sign up for NyxID at `nyx.chrono-ai.fun` with invite code
     `NYX-2XXJI08A` — SSO via GitHub / Google / Apple.
   - Install `ornn-agent-manual-cli` into the agent runtime; the agent
     will prompt for `nyxid` mid-setup.
   - Talk to the agent — four concrete plain-language example prompts
     for search / pull+install / audit / build+publish.
4. **Community and Contributing** — merged into one section with
   Roadmap folded in as a bullet (Issues + Milestones + Releases).
5. **License**.

Removes (links live on the website, in CONTRIBUTING.md, or as bullets
in Community):

- `## Run Ornn locally (5 minutes)` — Docker compose recipe is
  contributor onboarding, not a front-door section.
- `## How Ornn compares` — positioning belongs on
  ornn.chrono-ai.fun, not the README.
- `## Examples` — discoverable from the website + Quickstart prompts.
- `## Documentation` index — internal docs are reachable from
  CONTRIBUTING.md.
- `## Roadmap` — collapsed to a single Community bullet.
- Top-of-page nav strip — overkill for a 5-section README.

Header (just polished in #702) is untouched. Docs-only; no package
code or behaviour changes.
Empty (docs-only) changeset for the README 5-section restructure. No
package version bump.
docs(readme): restructure to tight 5-section layout with mermaid + concrete onboarding (#703)
…te (#705)

The diagram added in #703 rendered with Mermaid's default lavender +
light-yellow subgraph styling — functional, but exactly the generic
SaaS-template visual neighborhood DESIGN.md's Differentiation
Guardrails tell us to avoid.

Restyle to the dark forge palette so it matches the dark-only hero
(#702) and speaks the same brand language as the website. Locked to
dark in both GitHub themes per DESIGN.md's explicit carve-out for
operational / code-chrome surfaces:

> "Some surfaces may remain dark in both themes when the metaphor
>  requires operational contrast: terminal and code-chrome surfaces,
>  install or command surfaces, selective contrast islands where high
>  signal-to-noise matters."

A system architecture diagram is exactly that kind of operational
surface.

Token map (Mermaid `themeVariables` + per-node `classDef`):

- Canvas / "ORNN CLOUD" subgraph: `obsidian` (#0B0907 / #08070A)
- "YOUR MACHINE" subgraph: `graphite` (#14110B) — slightly lighter
  so the two subgraphs read as different materials without breaking
  the single-ember rule
- Forged-metal node fill: `color-surface-card` (#1A1610)
- Storage node: `iron` (#221E16) for the cylinder
- Borders: `steel` (#3A3328)
- Strong text on dark nodes: `parchment` (#F1ECDE)
- Edges + body text: `bone` (#C9BFAD)
- **Ember anchor on `ornn-api`**: `color-accent-primary` (#FF7322)
  with `ember-dim` (#C9460D) border — the brand action voice, single
  ember anchor per DESIGN.md's "restrained heat" principle.
- **Arc-blue on `NyxID`**: `color-accent-secondary` (#5BC8E8) with
  `arc-dim` (#3A8FB8) border — DESIGN.md's restricted secondary
  diagrammatic role (auth / identity = "cool side of the forge").

Subgraph titles render UPPERCASE per DESIGN.md display direction.
Hints at `JetBrains Mono` via `fontFamily` — GitHub Mermaid will fall
back to its sandboxed mono default but the hint is harmless.

Diagram structure and edges are unchanged from #703 — purely a
styling pass.
Empty (docs-only) changeset for the README mermaid restyle. No
package version bump.
…style

docs(readme): restyle How-it-works mermaid with Editorial Forge palette (#705)
…bel bg, material contrast (#707)

The #705 forge-palette restyle got the colors right but four issues
surfaced in the rendered diagram:

1. Subgraph titles "Your machine" and "Ornn cloud (ornn.chrono-ai.fun)"
   got clipped to "YOUR MACHIN" and "ORNN CLOUD · ornn.chrono-ai.fu"
   because the computed subgraph width was narrower than the title.
   → Switch to the DESIGN.md bracketed mono pattern
     `[ § YOUR MACHINE ]` / `[ § ORNN CLOUD ]`, drop the URL suffix
     (it's redundant with the README prose around the diagram).

2. Edge labels rendered as dark-on-dark rectangles. Mermaid defaults
   `edgeLabelBackground` to white; on a dark canvas every label
   ("invokes", "HTTPS", "verify token", …) carried a visible dark
   backdrop that looked like a rendering artifact.
   → Set `edgeLabelBackground: "#0B0907"` (canvas obsidian). Labels
     now float on the canvas with no backdrop.

3. Both subgraph fills were near-black (graphite + near-obsidian) so
   the two regions read as one big black blob — no material contrast.
   → Lift `[ § YOUR MACHINE ]` to `iron` (#221E16) and keep
     `[ § ORNN CLOUD ]` at `graphite` (#14110B). Gentle material
     distinction without breaking the single-ember rule. Also set
     `clusterBkg` / `clusterBorder` theme defaults so the per-subgraph
     `style` overrides are an intentional choice, not a fallback.

4. Every node had a parenthetical subtitle (`AI agent (any model)`,
   `Sandbox (remote execution)`, …) which widened each node and
   pushed the rightmost column off-screen — `Sandbox` partially
   disappeared behind GitHub's mermaid pan-zoom UI.
   → Trim node labels to one line. The README prose already covers
     "any model" / "local runtime" / "remote execution". Tighten edge
     labels too: `read / write` → `r/w`, `skill artifact` → `artifact`,
     etc.

Bonus refinements that fall out of the same pass:

- `CLI ==>|HTTPS| API` is now thick + ember-tinted via `linkStyle 1`.
  This is the single load-bearing action edge of the diagram and
  DESIGN.md explicitly allows "localized glow / wire / pulse effects
  as accents rather than baseline decoration" — one accent edge fits.
- Node strokes bumped to 1.5–2px for the "forged metal" feel.
- Subgraph title text switches from `bone` (body) to `parchment`
  (strong) — they're headings, not body copy.
- Default `lineColor` drops from `bone` (#C9BFAD) to `ash` (#7E776B)
  so the default edges recede and the ember-tinted HTTPS edge plus
  the ember/arc node accents win the visual weight contest.

Diagram structure (nodes, edges, directions) unchanged from #705.
Purely a styling + labelling polish.
Empty (docs-only) changeset for the README mermaid polish. No
package version bump.
…lish

docs(readme): polish How-it-works mermaid — bracketed titles, edge-label bg, material contrast (#707)
…ema (#649) (#710)

PR #672 added the `error: (issue) => …` actionable callbacks to the
backend canonical schema (`ornn-api/src/shared/schemas/skillFrontmatter.ts`)
but the SPA's separate pre-upload validator at
`ornn-web/src/utils/skillFrontmatterSchema.ts` still used bare
`z.string()`. The frontend gate fires first, so a SKILL.md with
`version: 0.1` (unquoted, YAML → number) still showed
`version: Invalid input` — the backend's friendly copy never
reached the user.

Mirror the same `invalid_type` callbacks on the frontend for
`version`, `tag`, `runtime-env-var`, `tool-list`, `runtime`, and
`runtime-dependency`. Six new i18n keys land in both en.json and
zh.json so Chinese users see localized copy too.

New unit test `skillFrontmatterSchema.test.ts` pins each branch and
guards against regressions on the existing `*Format` paths (which
fire for wrong-shape strings, not wrong-type values).

End-to-end QA verification 2026-05-22 confirmed the previous failure
mode (uploading a ZIP with `version: 0.1` shows `Invalid input`);
this commit fixes it.

Closes #649
)

PR #586 tightened backend `PUT /skills/:id` and `DELETE /skills/:id`
to `findByGuid` only — no `findByName` fallback. The SPA's owner-edit
page route stays human-readable (`/skills/:name/edit`), but the page
was passing the URL `:id` (the skill name) straight into
`useUpdateSkill` / `useUpdateSkillPackage`. The mutations then built
`PUT /api/v1/skills/<name>` and the backend returned 404 — every
owner-edit on the live cluster was broken.

Resolve the skill through `useSkill(id)` first (which still accepts
name OR GUID on GET) and feed `skill.guid` to both write hooks. The
fallback to URL `:id` only matters on the first paint before the
query settles; both mutations are gated behind the `isLoading` /
`!skill` early returns, so the user cannot click a button while the
resolved id still points at the name. Same pattern as
`useSkillDetail.ts:77` and `MySkillsPage.tsx:66`.

End-to-end QA on the local cluster (2026-05-22) confirmed the
failure (`PUT /skills/qa-528-… → 404`). After this change, owner-edit
hits the right GUID and the package upload succeeds.

Coverage: new `EditSkillPage.test.tsx` mocks `useSkill` and verifies
both write hooks receive `skill.guid`, not the URL `:id`.

Closes #565
NyxLlmClient previously hard-coded `{gatewayUrl}/responses` for both
stream() and complete(), ignoring the admin-configured
`apiFormat: chat-completion | responses` on the provider. DeepSeek and
other OpenAI-compatible providers without a `/responses` endpoint
returned 404 on every skill generation request — the UI surfaced this
as a misleading "LLM Gateway error (404)" while the model, key, and
gateway were all configured correctly.

Fix:
- `LlmProviderResolution` now carries `apiFormat`; `bootstrap.ts`
  threads `provider.apiFormat` through `resolveLlmProviderForSurface`.
- `NyxLlmClient.stream()` and `.complete()` dispatch on apiFormat:
  - `responses` → `POST {gatewayUrl}/responses` (unchanged body).
  - `chat-completion` → `POST {gatewayUrl}/chat/completions` with a
    translated body (input→messages, developer→system,
    max_output_tokens→max_tokens, instructions prepended as system
    message, tools projected into OpenAI function-tool shape).
- Chat Completions SSE chunks are normalized — each
  `choices[].delta.content` becomes a Responses-API
  `response.output_text.delta` event, so consumers (skill generation,
  playground) stay format-agnostic.
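
A rough sketch of the dispatch and translation described above. The type and function names (`endpointFor`, `toChatBody`, `normalizeChunk`) are hypothetical, the request shapes are trimmed to the fields the commit mentions, and the tool projection is omitted.

```typescript
// Hypothetical sketch; names and trimmed-down shapes are assumptions.
type ApiFormat = "responses" | "chat-completion";

interface ResponsesMessage {
  role: "user" | "assistant" | "developer" | "system";
  content: string;
}
interface ResponsesBody {
  model: string;
  instructions?: string;
  input: ResponsesMessage[];
  max_output_tokens?: number;
}
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}
interface ChatBody {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
}

function endpointFor(gatewayUrl: string, apiFormat: ApiFormat): string {
  return apiFormat === "responses"
    ? `${gatewayUrl}/responses`
    : `${gatewayUrl}/chat/completions`;
}

function toChatBody(body: ResponsesBody): ChatBody {
  // input -> messages, with the developer role mapped to system.
  const messages: ChatMessage[] = body.input.map((m): ChatMessage => ({
    role: m.role === "developer" ? "system" : m.role,
    content: m.content,
  }));
  // instructions are prepended as a system message.
  if (body.instructions !== undefined) {
    messages.unshift({ role: "system", content: body.instructions });
  }
  const chat: ChatBody = { model: body.model, messages };
  if (body.max_output_tokens !== undefined) {
    chat.max_tokens = body.max_output_tokens;
  }
  return chat;
}

interface ChatChunk {
  choices: { delta: { content?: string } }[];
}
interface OutputTextDelta {
  type: "response.output_text.delta";
  delta: string;
}

// Each non-empty choices[].delta.content becomes one Responses-style event.
function normalizeChunk(chunk: ChatChunk): OutputTextDelta[] {
  const events: OutputTextDelta[] = [];
  for (const choice of chunk.choices) {
    const text = choice.delta.content;
    if (text) events.push({ type: "response.output_text.delta", delta: text });
  }
  return events;
}
```

Keeping the translation in pure functions like these is one way to make the URL, body shape, and event mapping assertable without a live gateway.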

Tool-call delta normalization for chat-completion is intentionally
out of scope here. Without it, runtime/mixed skills under
chat-completion providers still render `execute_in_sandbox(...)` as
plain text instead of triggering the sandbox. That is #608.

Coverage: new colocated `llm.test.ts` mocks fetch and asserts URL,
method, headers, body shape, and stream event translation for both
formats — 11 tests, all green. The 18 pre-existing test failures on
develop (validateSkillFrontmatter #649, zod ZodErrorMap shape) are
unrelated.

Fixes #574
* fix(api): reuse chrono-sandbox session across tool rounds (#531)

The playground tool-use loop always hit the one-shot `/execute`
endpoint, so every `execute_in_sandbox` round started in a fresh
kernel. Anything an earlier round installed (the `nyxid` CLI, npm
packages, generated files, env writes, login state) was lost on the
next call — multi-step skills appeared to work locally for each call
but the chat-level state never accumulated.

Fix:

- `chat()` now keeps a per-stream `Map<language, sessionId>` plus a
  `createdSessionIds` list. The first `execute_in_sandbox` for a
  language lazily calls `sandboxClient.createSession({ language,
  dependencies, env, inputFiles, ttlSecs: 600, networkEnabled: true })`
  and records the id; later same-language calls reuse it via
  `sessionExecute(sessionId, ...)`. A different language inside the
  same chat (e.g. JS then Python) gets its own session.
- The whole loop is wrapped in `try / finally`; on exit we
  `Promise.allSettled(deleteSession(...))` for every session we
  created, swallowing errors and relying on the chrono-sandbox TTL
  as a backstop.
- `executeToolCall` now delegates `execute_in_sandbox` to
  `runSandboxToolCall`, with a shared `formatSandboxResult` helper.
  `load_skill` and the unknown-tool branch are unchanged.

Fail-open fallbacks keep the playground at least as good as before
if the session layer is down:

- `createSession` throws → log `warn`, fall back to one-shot
  `execute()` for that round (no session recorded).
- `sessionExecute` throws → log `warn`, drop the stale id from the
  map (next same-language call recreates), fall back to one-shot
  `execute()`.

Plain (non-sandbox) chats stay untouched — no createSession, no
deleteSession, zero added overhead.
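
The reuse, fail-open, and cleanup behaviour can be sketched roughly like this. `SandboxClient` here is a simplified stand-in with assumed signatures (the real client takes richer options such as `dependencies`, `env`, and `ttlSecs`), and `SandboxSessions` is a hypothetical wrapper, not the shape of the actual `chat()` code.

```typescript
// Hypothetical sketch; the interface below is a simplified assumption.
interface SandboxClient {
  createSession(opts: { language: string }): Promise<string>;
  sessionExecute(sessionId: string, code: string): Promise<string>;
  execute(language: string, code: string): Promise<string>;
  deleteSession(sessionId: string): Promise<void>;
}

class SandboxSessions {
  private readonly byLanguage = new Map<string, string>();
  private readonly created: string[] = [];
  private readonly client: SandboxClient;

  constructor(client: SandboxClient) {
    this.client = client;
  }

  async run(language: string, code: string): Promise<string> {
    let sessionId = this.byLanguage.get(language);
    if (sessionId === undefined) {
      try {
        // Lazily create one session per language and remember it.
        sessionId = await this.client.createSession({ language });
        this.byLanguage.set(language, sessionId);
        this.created.push(sessionId);
      } catch {
        // Fail open: session layer is down, use the one-shot endpoint.
        return this.client.execute(language, code);
      }
    }
    try {
      return await this.client.sessionExecute(sessionId, code);
    } catch {
      // Drop the stale id so the next same-language call recreates it.
      this.byLanguage.delete(language);
      return this.client.execute(language, code);
    }
  }

  // Best-effort cleanup; individual delete failures are ignored.
  async dispose(): Promise<void> {
    await Promise.allSettled(
      this.created.map((id) => this.client.deleteSession(id)),
    );
  }
}
```

A caller would wrap its tool loop in `try { ... } finally { await sessions.dispose(); }` so cleanup runs whether the loop completes or throws.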

Coverage: new colocated `chatService.test.ts` with a recording
sandbox stub and a queued-event stub `NyxLlmClient` — asserts that
two same-language calls produce one createSession + two
sessionExecute (id reused, no /execute), different languages produce
two sessions, finally always deletes, both fail-open branches
exercise one-shot, plain chat creates nothing, and delete failure
is swallowed. 8 tests, all green; full ornn-api suite 722 pass / 18
unrelated pre-existing failures (validateSkillFrontmatter #649).

Fixes #531

* chore: retry ci (codecov upload flake)
…715) (#736)

NyxID's DELETE /services/:id is a soft delete that flips is_active to
false. The Ornn DB aggregation behind /skill-facets/system-services
reads nyxidServiceId/slug/label straight off skill documents, so any
skill ever bound to a now-deactivated service kept surfacing the
service as a usable filter chip. Per-caller paths
(/me/nyxid-services, /nyxid-services/:serviceId/skills) already
filtered is_active=false inside NyxidServiceClient, but the 60-second
catalog cache widened the post-deletion visibility lag.

Fix:

- NyxidServiceClient.listActiveServiceIdsAsPlatform(saToken) (new):
  SA-token fetch of NyxID's /services projected to a Set<string> of
  active ids, kept in a one-slot cache (SA view is uniform across
  callers). Fail-soft: returns null on non-2xx or thrown fetch so
  callers can fall through to legacy behaviour when NyxID is
  unreachable.
- /skill-facets/system-services intersects the DB aggregation with
  that active set when the client + SA accessor are wired in; falls
  through to the raw aggregation otherwise. Bootstrap passes both.
- cacheTtlMs 60s -> 10s. After a NyxID deactivation every surface
  that goes through findVisibleToCaller drops the service within at
  most 10s instead of 60s.
- invalidateCache() also clears the platform cache.
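
To illustrate the intersection and the fail-soft fallthrough, here is a small sketch. `ServiceFacet`, `OneSlotCache`, and `filterActiveFacets` are hypothetical names; the real client fetches the active set over HTTP, which is left out here, and the clock is passed in explicitly to keep the example deterministic.

```typescript
// Hypothetical sketch; names and shapes are assumptions for illustration.
interface ServiceFacet {
  nyxidServiceId: string;
  label: string;
}

// One-slot TTL cache: a single value shared by all callers.
class OneSlotCache<T> {
  private slot: { value: T; expiresAt: number } | null = null;
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  get(now: number): T | undefined {
    return this.slot !== null && now < this.slot.expiresAt
      ? this.slot.value
      : undefined;
  }

  set(value: T, now: number): void {
    this.slot = { value, expiresAt: now + this.ttlMs };
  }

  clear(): void {
    this.slot = null;
  }
}

// null means "active set unavailable": fall through to the raw aggregation.
function filterActiveFacets(
  facets: ServiceFacet[],
  activeIds: ReadonlySet<string> | null,
): ServiceFacet[] {
  if (activeIds === null) return facets;
  return facets.filter((f) => activeIds.has(f.nyxidServiceId));
}
```

Returning `null` rather than an empty set on failure matters: an empty set would hide every service while NyxID is unreachable, whereas `null` preserves the legacy behaviour.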

Skill detail still shows the historical nyxidServiceId/slug/label
even when the service is deactivated — out of scope here. The issue
lists multiple acceptable mitigations; closing the discovery surface
is the highest-leverage one. A follow-up can mark the detail panel
as "service unavailable" if a louder signal is wanted.

Coverage: 8 colocated tests in service.test.ts — per-caller
is_active drop (defence-in-depth), missing is_active default,
platform method URL + auth header, caching, fail-soft on 5xx and
network throw, empty-SA short-circuit, invalidateCache re-fetch.
748 tests pass total / 18 unrelated pre-existing failures.

Fixes #715
…anguage (#737)

Closes a batch of leak points where the global Chinese (and in one case
English) switch translated the outer chrome but specific surfaces kept
rendering literal English strings via placeholder=, aria-label=,
hardcoded JSX text, or Zod inline messages.

- BroadcastsPage row dates: toLocaleString(undefined, fmt) now uses
  i18n.language so dates follow the UI language, not the browser
  locale (#731).
- skillCreateSchemas: hardcoded Zod messages move into a
  makeBasicInfoSchema(t) / makeContentSchema(t) factory pair routed
  through guided.validation.* keys; CreateSkillGuidedPage memoizes a
  per-`t` localized schema so a language switch rebuilds the form
  resolver (#695). Static schemas stay exported for type derivation.
- MarkdownEditor: Preview button label, empty-preview placeholder,
  and the Markdown help-text under the input now route through
  markdownEditor.* (#696). The Broadcast drawer itself was already
  fully t()-ified.
- AdminUsersTable: column headers, filter placeholder, empty state,
  Grant quota action, lastActive "Never" fallback, and the sort
  aria-label route through adminUsersTable.*. COLUMNS moved inside
  the component as a useMemo keyed on t (#697).
- AdvancedOptionsModal already called t() with English fallbacks for
  the NyxID binding panel — the nyxidService.* keys were never
  defined in either JSON. Added the full Chinese set + codified the
  English fallback as the official en.json entry so translations
  actually resolve at runtime (#719).
- Sidebar mobile-header "Navigation" and the 9 hardcoded
  SettingsNav entries (LLM Providers, Playground, Skill Generation,
  GitHub Mirror, NyxID Integration, Skill Auditing, PostHog,
  Service Binding List, Export / Import) now route through
  sidebar.navigation / adminSettingsNav.* (#722).

JSON additions: 5 new top-level namespaces in both en.json and
zh.json (nyxidService, markdownEditor, adminUsersTable, sidebar,
adminSettingsNav) plus a guided.validation block. No deletions or
renames in existing keys.

Out of scope: AnnouncementsPage shares the same toLocaleString(undefined)
pattern as BroadcastsPage and can ride a follow-up. Pre-existing
validateSkillFrontmatter #649 test failures (6) are unrelated.

Fixes #695, Fixes #696, Fixes #697, Fixes #719, Fixes #722, Fixes #731
…738)

The endpoint is Bearer-token-authenticated; no cookies. The lone
`credentials: "include"` on activityApi.ts (the rest of the SPA's
apiClient never sets it) forced the browser to demand
Access-Control-Allow-Credentials: true and a specific (non-*)
Access-Control-Allow-Origin on the preflight response — which the
NyxID proxy doesn't emit for this endpoint — so the OPTIONS failed
and every login on ornn.chrono-ai.fun ate a TypeError: Failed to
fetch in the console. Authenticated GETs through the same proxy
keep working because they're simple requests.

Dropping the flag brings the call into line with every other
authenticated POST in the SPA.

Fixes #709
#740)

Two issues piled up after a version delete inside the All-versions
modal:

1. useDeleteSkillVersion invalidated SKILLS_KEY / MY_SKILLS_KEY /
   per-version audit but missed SKILL_VERSIONS_KEY — the very query
   the modal subscribes to. The deleted row stayed visible and a
   second click hit the backend with SKILL_VERSION_NOT_FOUND.
2. ToastContainer and Modal both sat at z-50; with a modal open the
   toast portal rendered behind it and the in-modal action's
   success/error feedback became dim and unreadable.

Fix: add the missing invalidation on success, and bump
ToastContainer to z-[60] so toasts always sit above modal overlays.
Single source of truth — no per-call escape hatches.
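
The invalidation half of the fix can be sketched as below, assuming a TanStack Query style client whose `invalidateQueries` takes a `queryKey` filter. The key values, the `skillId` suffix, and the `onVersionDeleted` name are illustrative assumptions; the per-version audit invalidation is left out for brevity.

```typescript
type QueryKey = readonly unknown[];

// Minimal stand-in for the part of the query client this sketch needs.
interface InvalidatingClient {
  invalidateQueries(filters: { queryKey: QueryKey }): void;
}

// Assumed key shapes; the real constants are not shown in the source.
const SKILLS_KEY = ["skills"] as const;
const MY_SKILLS_KEY = ["my-skills"] as const;
const SKILL_VERSIONS_KEY = ["skill-versions"] as const;

// Success handler for a version delete: refresh every list that can still
// show the deleted row, including the versions list the modal reads.
function onVersionDeleted(client: InvalidatingClient, skillId: string): void {
  client.invalidateQueries({ queryKey: SKILLS_KEY });
  client.invalidateQueries({ queryKey: MY_SKILLS_KEY });
  client.invalidateQueries({ queryKey: [...SKILL_VERSIONS_KEY, skillId] });
}
```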

Fixes #699
…rs (#698) (#742)

useSectionForm joined `i.message` only, so admin-settings sections
that use `.min(1)` as a required-field gate (Mirror, NyxID
Integration, …) collapsed N field errors into a wall of identical
"Too small: expected string to have >=1 characters" entries with no
indication of which field needed filling.

Fix: prefix every issue with `path.join(".")`, and for the
too_small/string/min=1 triple rephrase to "is required". Renders as
"owner: is required; repo: is required; branch: is required; ..."
— actionable from the alert alone, no per-section schema rewrites
required. Per-section schemas that already ship friendlier custom
messages (`.min(1, "Owner is required")`) flow through unchanged.
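
A minimal sketch of that formatting rule. `IssueLike` is a local stand-in for the validator's issue object, and the `origin` / `minimum` field names and the default-message check are assumptions made for illustration, not the actual useSectionForm code.

```typescript
// Local stand-in for a validation issue (field names are assumptions).
interface IssueLike {
  code: string;
  path: (string | number)[];
  message: string;
  origin?: string;
  minimum?: number;
}

// Assumed prefix of the library's default too_small message.
const DEFAULT_TOO_SMALL_PREFIX = "Too small";

function formatIssue(issue: IssueLike): string {
  // The required-field gate: a string that must have at least 1 character.
  const isRequiredGate =
    issue.code === "too_small" && issue.origin === "string" && issue.minimum === 1;
  // Only rephrase the library default; custom messages pass through as written.
  const usesDefaultMessage = issue.message.startsWith(DEFAULT_TOO_SMALL_PREFIX);
  const text = isRequiredGate && usesDefaultMessage ? "is required" : issue.message;
  const field = issue.path.join(".");
  return field.length > 0 ? `${field}: ${text}` : text;
}

function formatIssues(issues: IssueLike[]): string {
  return issues.map(formatIssue).join("; ");
}
```

Keeping the rephrase inside the formatter, rather than in each schema, is what lets sections adopt it without schema edits.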

Fixes #698
…739)

handleResponse parsed only RFC 7807 problem+json fields (body.code /
.detail / .title) on non-2xx, so legacy responses like
{ data: null, error: { code: "INVALID_SETTING", message: "..." } }
fell through to the generic "An unexpected error occurred" literal —
losing actionable detail from LLM provider sync, settings
validation, and anything still funnelling through
AppError → buildErrorEnvelope.

Fix: on non-2xx, try body.error.{code,message} first (the more
actionable shape when present), then RFC 7807 fields, then the
generic literal. Per-domain onError handlers already pass the
ApiClientError through translateError(err, fallback), so the
message reaches the toast/banner unchanged.
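
The lookup order described above might look roughly like this. `extractError` and `ExtractedError` are hypothetical names, and the sketch only covers message selection, not the rest of handleResponse.

```typescript
interface ExtractedError {
  code: string | undefined;
  message: string;
}

const GENERIC_MESSAGE = "An unexpected error occurred";

function asRecord(value: unknown): Record<string, unknown> | undefined {
  return typeof value === "object" && value !== null
    ? (value as Record<string, unknown>)
    : undefined;
}

function nonEmptyString(value: unknown): string | undefined {
  return typeof value === "string" && value.length > 0 ? value : undefined;
}

// Order: legacy { error: { code, message } } first, then RFC 7807
// detail / title, then the generic literal.
function extractError(body: unknown): ExtractedError {
  const root = asRecord(body);
  if (!root) return { code: undefined, message: GENERIC_MESSAGE };

  const legacy = asRecord(root.error);
  const legacyMessage = legacy ? nonEmptyString(legacy.message) : undefined;
  if (legacy && legacyMessage) {
    return { code: nonEmptyString(legacy.code), message: legacyMessage };
  }

  const problemMessage = nonEmptyString(root.detail) ?? nonEmptyString(root.title);
  if (problemMessage) {
    return { code: nonEmptyString(root.code), message: problemMessage };
  }

  return { code: undefined, message: GENERIC_MESSAGE };
}
```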

No new unit test: the existing module-init chain
(apiClient → authStore → ...) doesn't load cleanly under vitest
without additional setup, and the change is localized to one
branch. Manual repro per the issue body (sync against an invalid
model-list URL; save PostHog config with a loopback host).

Fixes #694
…x-time (#721) (#744)

buildSkillContext injected KEY=value pairs into the developer
message so the model could thread env values through to
execute_in_sandbox. When a chat-completion provider returned the
tool call as assistant text instead of a structured tool-call
frame, the secret value appeared verbatim in the user-visible
transcript. Even with the wire-level redaction the QA evidence
relied on, the bug was that the secret was in the LLM prompt at
all.

Fix:
- buildSkillContext now lists only the env var names the user
  provided, with placeholder `KEY=<provided-server-side>` and an
  instruction telling the model to reference by name. The model
  never has the literal value.
- runSandboxToolCall merges request.envVars on top of args.env
  before invoking the sandbox — user-supplied keys always win at
  execution time. Keys the model invents (sentinel markers) ride
  through unchanged.

Even if a regression lets a model serialize the tool call as text
again, the transcript carries placeholders, and the sandbox still
runs with real values because the merge happens server-side
before sessionExecute / one-shot execute.
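
Both halves of the fix can be sketched in a few lines. `describeEnvForPrompt` and `mergeSandboxEnv` are hypothetical helper names; the actual code lives in buildSkillContext and runSandboxToolCall and may be shaped differently.

```typescript
type Env = Record<string, string>;

// Prompt side: expose only the names, never the values.
function describeEnvForPrompt(userEnv: Env): string[] {
  return Object.keys(userEnv).map((key) => `${key}=<provided-server-side>`);
}

// Execution side: start from whatever the model sent, then overlay the
// user-supplied values so they win on any key collision.
function mergeSandboxEnv(modelEnv: Env | undefined, userEnv: Env | undefined): Env {
  return { ...(modelEnv ?? {}), ...(userEnv ?? {}) };
}
```

Because the overlay is a plain object spread, keys that only the model supplied survive untouched, and a placeholder the model echoed back for a user key is replaced by the real value.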

Coverage: 2 new tests in chatService.test.ts — user-override-wins
(real value replaces model's guess, untouched keys ride through)
and no-envVars passthrough (model-only env reaches sandbox
unchanged). 10/10 file tests green.

Fixes #721
…eholder, shared-via-org name (#716 #725 #727 #729) (#745)

- #716: /admin/mirror now lazy-mounts MirrorPage again (counts +
  manual reconcile + status header). MirrorSection's "Open mirror
  dashboard" link points at /admin/mirror instead of /admin/skills.
  Pre-fix the redirect-to-settings landed admins on the same page
  with no manual-reconcile control, and the link wandered into the
  skills list.
- #725: mode-selection card description gets min-h so the bullet
  list below starts on the same baseline across all four cards
  regardless of description length (Free Mode was visibly shorter).
- #727: keyword search placeholder claimed tags were searchable;
  the backend keyword path doesn't match tags. Trimmed "or tags"
  from both en/zh strings + the in-component fallback.
- #729: shared-via-org SkillCard sub-line resolves the org name
  from useMyOrgs() and renders viaOrganizationNamed; falls back to
  the generic "Via organization" copy when the lookup misses.

Out of scope: #723 (notifications scroll — needs RootLayout
overflow investigation), #726 (semantic empty-query UX state).
Both are deferred to follow-ups.

Fixes #716, Fixes #725, Fixes #727, Fixes #729
SandboxClient.post throws an Error whose message is "Sandbox
service error (<status>): <raw body>", and the playground catch
spat that string straight into the chat — the QA's nyxid LIST
CAPABILITIES repro surfaced raw JSON
{"error":"internal_error","error_code":1006,"message":"An internal
error occurred"} in the transcript, presented as if the user had
done something wrong.

Fix: new formatSandboxError parses the wrapper, peels out the
structured envelope when present, and returns one of:
- 500 / 1006 → "Sandbox is having trouble running this script
  [code N]. ... try again, or simplify if it keeps failing."
- 503 / 504 → "Sandbox timed out / temporarily unavailable [code
  N]. Try again in a few seconds."
- anything else → "Sandbox execution failed (HTTP X) [code N]:
  <message>"
- non-JSON body → raw fallback so a new upstream shape still
  reaches operators.
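
A rough sketch of that mapping, under stated assumptions: the wrapper regex, the envelope field names (taken from the JSON quoted above), and treating "500 / 1006" as "status 500 or error_code 1006" are guesses, and the user-facing strings are shortened because the commit message itself abbreviates them.

```typescript
interface SandboxEnvelope {
  error?: string;
  error_code?: number;
  message?: string;
}

// Matches "Sandbox service error (<status>): <raw body>".
const WRAPPER = /^Sandbox service error \((\d+)\): ([\s\S]*)$/;

function formatSandboxError(rawMessage: string): string {
  const match = WRAPPER.exec(rawMessage);
  if (!match) return rawMessage; // not the wrapper shape: leave untouched

  const status = Number(match[1]);
  let parsed: unknown;
  try {
    parsed = JSON.parse(match[2] ?? "");
  } catch {
    return rawMessage; // non-JSON body: raw fallback
  }
  if (typeof parsed !== "object" || parsed === null) return rawMessage;

  const envelope = parsed as SandboxEnvelope;
  const code = envelope.error_code ?? "unknown";

  if (status === 500 || envelope.error_code === 1006) {
    // Remainder of the real copy is elided in the source.
    return `Sandbox is having trouble running this script [code ${code}].`;
  }
  if (status === 503 || status === 504) {
    return `Sandbox timed out / temporarily unavailable [code ${code}]. Try again in a few seconds.`;
  }
  return `Sandbox execution failed (HTTP ${status}) [code ${code}]: ${envelope.message ?? ""}`;
}
```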

Structured Pino error log carries language, scriptLen, raw
err.message so admins can grep prod for 1006-class failures
without scrolling chat transcripts.

The chrono-sandbox 500 itself is out of scope here — that lives
inside the sandbox runtime, not Ornn. This is about the
playground surface of the failure.

Fixes #530
)

The bell badge subscribes to useUnreadNotificationCount (polled
every UNREAD_POLL_MS=30s) and the dropdown to useNotifications (no
refetchInterval, only staleTime: 10s). When a new targeted
broadcast landed, the count poll ticked to 1 and the badge
updated — but useNotifications only refetched on remount or
invalidation, so an open / still-fresh dropdown showed the
pre-broadcast list. Users saw "1 unread" plus a list that didn't
contain it.

Fix: mirror the count's refetchInterval on useNotifications. Both
queries tick together; refetchIntervalInBackground: false matches
the count so a hidden tab spends nothing. staleTime stays 10s so
quick toggles still serve cache.
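
As a sketch, the two sets of query options would line up like this. The option names are the ones mentioned above; the object names are illustrative, and the query keys and fetchers are omitted.

```typescript
const UNREAD_POLL_MS = 30_000;

// Options for the bell-badge count query.
const unreadCountOptions = {
  refetchInterval: UNREAD_POLL_MS,
  refetchIntervalInBackground: false,
};

// Options for the dropdown list query: same polling cadence as the count,
// while the 10s staleTime still lets quick open/close toggles hit cache.
const notificationsOptions = {
  staleTime: 10_000,
  refetchInterval: UNREAD_POLL_MS,
  refetchIntervalInBackground: false,
};
```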

Fixes #728
auditSummaryByVersion returns the latest *completed* audit per
version, so a newer failed rerun was invisible — Skill Detail kept
rendering the stale prior score like nothing had changed. Admins
only learned the rerun failed by opening Audit History.

Fix: useSkillDetail already loads versionAuditHistory (newest-first
across all statuses) for the in-progress spinner. Compute one more
derived flag — versionAuditLatestFailed = history[0].status ===
"failed" && createdAt newer than the displayed completed audit —
and thread it to AuditVerdictPill as the new `latestRerunFailed`
prop. The pill keeps the prior completed score so admins still see
the last-good number, and renders a danger-toned banner below
("Latest rerun failed — score above is from the prior audit.
Check audit history for details.").
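
The derived flag could be computed along these lines. `AuditRecord` is a simplified stand-in, `createdAt` is assumed to be an ISO timestamp string, the `"running"` status is an assumed value, and returning `false` when no completed audit is displayed is an assumption about the edge case.

```typescript
interface AuditRecord {
  status: "running" | "completed" | "failed";
  createdAt: string; // assumed ISO-8601 timestamp
}

// True when the newest audit of any status failed and is newer than the
// completed audit whose score is currently on screen.
function isLatestRerunFailed(
  historyNewestFirst: AuditRecord[],
  displayedCompleted: AuditRecord | undefined,
): boolean {
  const latest = historyNewestFirst[0];
  if (!latest || latest.status !== "failed") return false;
  if (!displayedCompleted) return false; // no prior score to mark as stale
  return Date.parse(latest.createdAt) > Date.parse(displayedCompleted.createdAt);
}
```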

No backend change. The latest-of-any-status signal was already in
hand from the running-poll path; it just wasn't propagated.

Fixes #718
#747)

- #723: NotificationsPage had no own scroll surface; RootLayout's
  <main> is overflow-hidden, so once the list outgrew the viewport
  older entries were unreachable. Wrapped body in an h-full
  overflow-y-auto shell, same pattern UploadSkillPage uses for the
  same constraint.
- #726: ExplorePage now detects mode==="semantic" && empty query
  and swaps the generic "No skills match" EmptyState for an
  explicit "Enter a search description" / "Semantic search needs a
  description of what you're looking for. Type a phrase in the
  search box above, or switch back to Keyword mode." (with zh
  translation). Backend queries still fire — they're cheap and
  cached — but the user sees a clear validation hint instead of a
  misleading null result.

Fixes #723, Fixes #726
… (#748)

Backend: SearchService.search now applies a zero-trust post-filter on
scope=shared-with-me — any item whose myAccessReason=shared-via-org
points at an org NOT in the caller's current userOrgIds gets dropped.
applyScope already gates this at the DB layer; this is defence-in-
depth against cache lag, partial replication, and future regressions.
Drops emit a warn-level log so data drift is observable.
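
A minimal sketch of such a post-filter. The `viaOrgId` field, the `warn` callback signature, and dropping items with no org id at all are assumptions made for illustration; only `myAccessReason` and `userOrgIds` come from the description above.

```typescript
interface SearchItem {
  id: string;
  myAccessReason?: string;
  viaOrgId?: string; // hypothetical: the org the share was granted through
}

type WarnFn = (message: string, context: Record<string, unknown>) => void;

// Re-checks org-mediated access after the DB-level scope has already run.
function postFilterSharedWithMe<T extends SearchItem>(
  items: T[],
  userOrgIds: ReadonlySet<string>,
  warn: WarnFn,
): T[] {
  return items.filter((item) => {
    if (item.myAccessReason !== "shared-via-org") return true;
    const allowed = item.viaOrgId !== undefined && userOrgIds.has(item.viaOrgId);
    if (!allowed) {
      // A drop here means the upstream filter and current memberships disagree.
      warn("dropping shared-via-org item outside caller orgs", {
        id: item.id,
        orgId: item.viaOrgId,
      });
    }
    return allowed;
  });
}
```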

Frontend (PermissionsModal):
- Orgs: fetchOrgSummary entries carry isUnresolved; unresolved rows
  get a warning-toned background, an "unresolved" badge with triangle
  icon, and a tooltip. Click the checkbox to revoke.
- Users: resolveUsers leaves placeholder { userId, email:"",
  displayName: userId } when a lookup misses — that signal drives a
  danger-toned chip + triangle icon + tooltip. The existing × button
  revokes.

Owners now see every grant they've made AND see which ones point at
entities that don't resolve any more, with one-click revoke.

Fixes #720
* docs: prep release v0.9.0

Adds the dated release-notes file (.github/release-notes-20260529.md) so the
upcoming develop → main cut produces a tagged v0.9.0 GitHub Release with
curated user-facing notes.

~97 changesets have accumulated on develop since v0.8.0. The highest
fixed-pair bump is minor, so both packages go 0.8.0 → 0.9.0 in fixed mode.
The release is API-contract heavy: 5 breaking changes to /api/v1/* (RFC 7807
envelope, lowercase_snake_case codes, GUID-only version writes, RFC 8594
deprecation headers, ownerId removal) — all pre-1.0 minor bumps per convention.

Notes follow the template: 6-12 word product-level bullets, breaking changes
flagged, purely-technical items collapsed into the trailing bucket per section.
All three sections present, no (write here) placeholder — check-release-notes.yml
will pass.

Closes #753.

* docs: add empty changeset for v0.9.0 release prep

check-changeset.yml requires every develop PR to carry a changeset. This
release-prep PR is docs/CI-only, so it ships an empty changeset (no package
bump) — same pattern as the v0.8.0 prep PR (#556). The actual version bump is
declared by the ~97 feature changesets already on develop.
// `randomBytes(N)` returns exactly N bytes — loop bound is N, so
// `bytes[i]` is always defined. `!` is safe under
// noUncheckedIndexedAccess (#450).
out += REDEMPTION_CODE_ALPHABET[bytes[i]! % REDEMPTION_CODE_ALPHABET.length];
@chronoai-shining chronoai-shining merged commit 116a985 into main May 29, 2026
19 of 20 checks passed