22 changes: 22 additions & 0 deletions docs/maestro-compat-debt-map.md
@@ -0,0 +1,22 @@
# Maestro Compatibility Debt Map

This map summarizes the Maestro compatibility surface after the lane 6 audit. LOC values are approximate and come from `wc -l` on the owning files; rows that share a file use an estimated slice. The current compat implementation is about 4.2k LOC under `src/compat/maestro`, with about 1.8k LOC of focused compat unit tests plus daemon/provider integration guards.

| Area | Owning files | Approx LOC/code size | Custom handling summary | Native overlap and dependencies | Reliability or faster convergence opportunity | Risk | Test guard status | Recommendation | Suggested PR lane |
| --- | --- | ---: | --- | --- | --- | --- | --- | --- | --- |
| Parser and command mapping | `src/compat/maestro/replay-flow.ts`, `command-mapper.ts`, `support.ts`, `types.ts`, plus parse-side slices of `interactions.ts`, `device-actions.ts`, `flow-control.ts`, `run-script.ts` | ~1.5k LOC | Parses multi-document YAML, splits config/commands, resolves env, tracks rough top-level line numbers, flattens hooks, maps commands to `SessionAction`, rejects unsupported commands/fields loudly, and emits `__maestro*` runtime commands for semantics that native replay does not model. | Depends on `yaml`, replay `SessionAction`, `AppError`, and native commands such as `open`, `click`, `fill`, `type`, `wait`, `is`, `scroll`, `swipe`, `keyboard`, and `screenshot`. | Common replay parsing helpers could own env precedence, source-path handling, and line diagnostics. Native command builders could reduce stringly `SessionAction` construction for commands that already match `.ad`. | Line mapping is intentionally approximate for nested lists; parse-time expansion can hide runtime structure; unsupported syntax must keep failing loudly. | Strong: `replay-flow.test.ts` covers supported subset, env, hooks, runFlow, repeat, retry, launch reset, unsupported fields, fixture parsing; `replay-input.test.ts` covers backend routing/env precedence. | Keep Maestro YAML grammar local. Converge shared replay env/source/line utilities and typed action construction where native commands already exist. | Lane 7: parser contract hardening |
| Runtime target matching | `src/compat/maestro/runtime-targets.ts` | 926 LOC | Implements Maestro-specific selector resolution over raw snapshots: id/label/text/value equality-or-regex matching, visible text fuzzy fallback, `childOf`, `index`, visible-only filtering, React Native overlay blocking, foreground duplicate preference, Android rectless hidden navigation handling, tab-strip slot inference, localized breadcrumb selection, and tap-target ancestor promotion. | Uses native selector parsing/matching (`parseSelectorChain`, `matchesSelector`), native visible predicates, snapshot text normalization, and React Native overlay detection. | Extract a reusable snapshot target resolver with policy hooks for ranking, visibility, overlay filtering, and promotion. Native `click`/`wait` could then opt into the generic parts without inheriting Maestro ranking. | Highest debt concentration. Heuristics encode real platform quirks and can regress seemingly unrelated flows, especially Android duplicate/hidden nodes and broad containers. | Strong: `runtime-targets.test.ts` is large and focused; PR #620 adds provider-level Android fresh-snapshot guard through Maestro replay. | Split generic snapshot traversal/filtering from Maestro ranking policy. Keep fuzzy/regex/read-order compatibility local. | Lane 7: target resolver extraction |
| Input and focus | `replay-flow.ts` input coalescing, `interactions.ts` input/erase/paste/pressKey slices, `runtime.ts`, daemon Maestro fallback in `interaction-touch.ts` | ~350 LOC plus shared daemon flags | Coalesces `tapOn` + `inputText` + `pressKey Enter` into `wait`/`fill`/enter for likely text-entry selectors, leaves focused-field input as `type`, maps `eraseText` to backspaces, falls back from keyboard enter to newline typing, and allows Maestro non-hittable coordinate fallback for tap selectors. | Uses native `fill`, `type`, `keyboard`, `click`, replay variable substitution, and daemon touch fallback flags. | A native "focus then fill" replay primitive and focused-field clear/erase operation would remove most parser heuristics while improving `.ad` flows too. | Text-entry detection is name-based; focused-field commands depend on existing device focus; non-hittable coordinate fallback is intentionally compatibility-only and can mask poor selectors. | Moderate: parser coalescing tests in `replay-flow.test.ts`; daemon fallback covered in `interaction.test.ts`; no broad provider guard for focused erase/paste. | Converge on native focus/fill/clear semantics. Keep Maestro's optional/non-hittable quirks as compat flags. | Lane 8: native input primitive |
| Assertions and waits | `runtime-assertions.ts`, wait/assert slices of `interactions.ts`, `runtime-flow.ts` visible condition path | ~450 LOC | Polls raw snapshots for `assertVisible`, adds a terminal grace capture, treats `assertNotVisible` as stable hidden after repeated samples or timeout, computes animation stability from snapshot signatures, maps `extendedWaitUntil`, and waits briefly for `runFlow.when.visible` while keeping `notVisible` point-in-time. | Uses native `snapshot`, `wait`, `is`, visible predicates, reference frames, replay blocks, and timeout helpers. | A shared waiter/poller utility with explicit grace, stable-hidden, raw-snapshot, and timeout policies would let native `wait` and Maestro share mechanics without sharing defaults. | Timeout and stability semantics are subtle; raw full snapshots are expensive; making `notVisible` wait would change cleanup-flow behavior. | Good: `runtime-assertions.test.ts` covers deadline/hidden edge cases; `runtime-flow.test.ts` covers visible wait and immediate notVisible. | Extract generic polling/stability helpers; keep Maestro timing constants and condition semantics local. | Lane 8: waiter convergence |
| Scroll, swipe, and geometry | `points.ts`, `runtime-geometry.ts`, geometry/scroll/swipe slices of `interactions.ts` and `runtime-interactions.ts` | ~500 LOC | Parses absolute and percentage points, converts percentage taps/swipes through raw snapshot reference frames, caches reference frame in replay scope, biases tap points for large text containers and bottom tabs, loops `scrollUntilVisible` with `wait`/`find` probes, maps `swipe.label`, and uses an Android horizontal content-lane adjustment. | Uses native `click`, `scroll`, `swipe`, `find`, `wait`, touch reference frames, and raw snapshots. | A native gesture planner for percent coordinates, target-derived swipes, and frame caching would speed up convergence. Platform-specific lane policies can remain pluggable. | Geometry heuristics are device/frame sensitive; stale or missing raw snapshot frames break percent gestures; Android horizontal swipes depend on app layout assumptions. | Good: `runtime-geometry.test.ts`, `runtime-interactions.test.ts`; PR #620 provider guard asserts fresh snapshots and Android content-lane swipe coordinates. | Extract generic percent/target geometry and frame caching. Keep Maestro Android content-lane and tap bias policies local until broader demand exists. | Lane 8 or 9: gesture planner |
| Flow control, `runFlow`, retry, and hooks | `flow-control.ts`, `runtime-flow.ts`, hook flattening in `replay-flow.ts` | ~700 LOC | Handles `onFlowStart`/`onFlowComplete`, file and inline `runFlow`, per-block env, static platform gates, limited `when.true` boolean/platform expressions, runtime visible/notVisible gates via batch steps, parse-expanded `repeat.times`, and runtime `retry` via replay retry blocks. | Depends on replay block runtime (`invokeReplayActionBlock`, `invokeReplayRetryBlock`), replay vars/env, batch step projection, and native snapshot visibility resolution. | A native replay block AST for conditional blocks, deterministic repeats, and retry would remove parse-time flattening and make line/step reporting more faithful. | `repeat.times` materializes actions with a guardrail; nested line numbers are lossy; expression support is intentionally tiny; visible conditions are snapshot-dependent. | Strong: `replay-flow.test.ts` covers hooks/runFlow/env/repeat/retry/platform gates/expressions; `runtime-flow.test.ts` covers runtime condition behavior; PR #620 covers provider retry/runFlow path. | Converge on native replay control-flow primitives, but keep Maestro expression grammar and unsupported `repeat.while` local until native runtime has loop semantics. | Lane 7 or 9: replay block AST |
| `runScript` | `run-script.ts`, `runtime.ts` runScript branch | 229 LOC plus router branch | Executes trusted flow-local JavaScript with `node:vm`, exposes env values, `output`, `json`, and synchronous-looking `http.post` through a timeout-bounded child Node process; exports output as `output.<key>` replay variables. | Uses replay variable scope, `runCmdSync`, and `AppError`; there is no native `.ad` command equivalent. | Shared env/output variable plumbing could converge, but script execution should not become native without a separate security model and product decision. | High security and determinism risk: `node:vm` is not a sandbox, scripts can make network requests, async support is intentionally narrow, and output key rules are compatibility-specific. | Partial: parser/order/env/path behavior covered in `replay-flow.test.ts`; docs describe trust/security limits; no focused execution tests for `http.post`, `json`, or output validation. | Keep compat-local. Add focused execution tests before expanding helpers; do not expose as native command until sandboxing and trust model are explicit. | Lane 9: runScript guard tests |
| Suite discovery and test artifacts | `src/daemon/handlers/session-test-discovery.ts`, `session-test-suite.ts`, `session-test-artifacts.ts`, replay grammar/backend plumbing | ~120 Maestro-specific LOC in daemon/test path | When `--maestro`/backend is selected, discovers `.ad`, `.yaml`, and `.yml`, allows untyped YAML through platform filtering, runs through native replay test suite, and preserves original Maestro flow filenames in artifacts. | Native test suite runner, replay backend selection, session lifecycle, artifact materialization, and CLI `--maestro` flag. | Mostly converged already. The remaining improvement is making replay backend extension/filter policy data-driven so future backends avoid daemon conditionals. | Low. Main risk is accidentally including non-Maestro YAML when backend is set, or changing platform-filter behavior for untyped YAML. | Good: `session-test-discovery.test.ts`, `session-test-suite.test.ts`, `session-test-artifacts.test.ts`; PR #620 provider suite runs a Maestro YAML test. | Keep small daemon hook for now; extract backend discovery policy only if another replay backend appears. | Lane 6 complete / no follow-up unless backend count grows |
| Docs and support matrix | `support-matrix.ts`, `website/docs/docs/replay-e2e.md`, CLI flag/help tests, issue tracker references | 42 LOC source matrix plus docs | Maintains supported/unsupported capability lists, formats CLI help copy, links tracker/new issue URLs, and keeps replay docs synced with the source matrix. | CLI flag definitions/help rendering and website docs. | Generate or embed the support list from one source in docs/help to remove manual prose drift. | Medium user-facing risk: stale docs can imply unsupported parity or hide known gaps. | Good: `support-matrix.test.ts` asserts CLI help and docs stay synced with the shared matrix. | Keep `support-matrix.ts` as source of truth; update it with every behavior change and mirror only explanatory context in docs. | Lane 6 complete / ongoing docs hygiene |
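
The "snapshot target resolver with policy hooks" suggested in the runtime target matching row could look roughly like the sketch below. It is illustrative only: `SnapshotNode`, `TargetPolicy`, `resolveTarget`, and both policies are hypothetical names and shapes, not the existing `runtime-targets.ts` API, and the leaf-preferring ranking merely stands in for Maestro's real heuristics.

```typescript
// Hypothetical snapshot node shape; the real raw snapshot type is richer.
interface SnapshotNode {
  id?: string;
  label?: string;
  visible: boolean;
  children?: SnapshotNode[];
}

// Policy hooks: the generic resolver owns traversal, the policy owns
// filtering and ranking.
interface TargetPolicy {
  // Whether a matching node may be considered at all (visibility, overlays).
  accept(node: SnapshotNode): boolean;
  // Higher score wins; ties keep snapshot (pre-order) read order.
  rank(node: SnapshotNode): number;
}

// Generic part: depth-first, pre-order traversal of the snapshot tree.
function collectNodes(node: SnapshotNode, out: SnapshotNode[] = []): SnapshotNode[] {
  out.push(node);
  for (const child of node.children ?? []) collectNodes(child, out);
  return out;
}

// Generic part: pick the best accepted match according to the policy.
function resolveTarget(
  root: SnapshotNode,
  matches: (node: SnapshotNode) => boolean,
  policy: TargetPolicy,
): SnapshotNode | undefined {
  let best: SnapshotNode | undefined;
  let bestScore = -Infinity;
  for (const node of collectNodes(root)) {
    if (!matches(node) || !policy.accept(node)) continue;
    const score = policy.rank(node);
    if (score > bestScore) {
      best = node;
      bestScore = score;
    }
  }
  return best;
}

// A plain policy: first visible match in read order.
const nativePolicy: TargetPolicy = {
  accept: (node) => node.visible,
  rank: () => 0,
};

// A compat-style policy (assumed for illustration): prefer leaves over
// broad containers.
const leafFirstPolicy: TargetPolicy = {
  accept: (node) => node.visible,
  rank: (node) => ((node.children ?? []).length === 0 ? 1 : 0),
};
```

With this split, native `click`/`wait` could call `resolveTarget` with a plain policy while the Maestro runtime passes its own ranking, so a heuristic change stays confined to one policy object.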

## Convergence Priority

1. Target matching is the largest and riskiest debt. Extract reusable snapshot traversal/filtering first, then keep Maestro ranking rules as a policy.
2. Replay control-flow should converge before adding more Maestro flow syntax. A native block AST would make `runFlow`, `retry`, and future loop support less brittle.
3. Input/focus and wait semantics are good candidates for native primitives because they improve `.ad` replay as well as Maestro compatibility.
4. `runScript` should remain compatibility-local unless a separate secure native script story is designed.
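
To make the second priority concrete, here is a minimal sketch of what a native replay block AST might look like, assuming a synchronous step runner. `ReplayNode`, `ExecContext`, `executeBlock`, and the path format are all hypothetical and not part of the current replay runtime. The point is that repeats, retries, and platform gates stay structural, so every executed step still carries its source line and a block path instead of being flattened at parse time.

```typescript
// Hypothetical block AST: control flow is kept as nodes, not expanded.
type ReplayNode =
  | { kind: "action"; line: number; command: string }
  | { kind: "repeat"; line: number; times: number; body: ReplayNode[] }
  | { kind: "retry"; line: number; maxRetries: number; body: ReplayNode[] }
  | { kind: "when"; line: number; platform: string; body: ReplayNode[] };

// One record per executed step, for faithful line/step reporting.
interface StepRecord {
  line: number;
  command: string;
  path: string;
}

interface ExecContext {
  platform: string;
  // Assumed runner: returns true when the command succeeds.
  run(command: string): boolean;
  trace: StepRecord[];
}

// Runs nodes in order and stops at the first failure.
function executeBlock(nodes: ReplayNode[], ctx: ExecContext, path = "root"): boolean {
  for (const node of nodes) {
    if (!executeNode(node, ctx, path)) return false;
  }
  return true;
}

function executeNode(node: ReplayNode, ctx: ExecContext, path: string): boolean {
  switch (node.kind) {
    case "action":
      ctx.trace.push({ line: node.line, command: node.command, path });
      return ctx.run(node.command);
    case "repeat":
      // Deterministic repeat: the body runs `times` times at runtime.
      for (let i = 1; i <= node.times; i++) {
        if (!executeBlock(node.body, ctx, `${path}/repeat#${i}`)) return false;
      }
      return true;
    case "retry":
      // One initial attempt plus up to `maxRetries` further attempts.
      for (let attempt = 0; attempt <= node.maxRetries; attempt++) {
        if (executeBlock(node.body, ctx, `${path}/retry#${attempt}`)) return true;
      }
      return false;
    case "when":
      // A gate that does not apply is skipped and counts as success.
      return node.platform !== ctx.platform || executeBlock(node.body, ctx, `${path}/when`);
  }
}
```

A Maestro frontend would then lower `runFlow`, `retry`, and `repeat.times` into these nodes, and loop forms such as `repeat.while` could be added later as another node kind once the native runtime defines loop semantics.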