Track Maestro flow compatibility for replay --maestro #558

@thymikee

Problem

replay --maestro and test --maestro are compatibility paths for running common Maestro YAML flows through Agent Device replay and test execution. This issue tracks the supported subset and the remaining compatibility work, so that we do not accidentally imply full Maestro parity.

Reference docs:

Architecture status

  • Keep Maestro parsing and command mapping isolated in src/compat/maestro.
  • Keep Maestro runtime compatibility shims isolated in src/daemon/handlers/session-replay-maestro-runtime.ts.
  • Route CLI --maestro through generic replay/test backend plumbing.
  • Emit replayable SessionAction[] from the compat layer.
  • Keep lower-level iOS support explicit and opt-in for compatibility-only behavior, such as non-hittable selector taps.
  • Reject unsupported commands/fields with precise command names and line numbers.
  • Document the supported subset in CLI help.
  • Cover the test-app Maestro fixture with parser/runtime tests.
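
The "reject unsupported commands with precise command names and line numbers" rule, together with emitting SessionAction[], can be sketched as below. This is a minimal illustration under stated assumptions, not the code in src/compat/maestro: the ParsedCommand and SessionAction shapes, the SUPPORTED table, MaestroCompatError, and toSessionActions are hypothetical names, and the two mappings shown are examples only.

```typescript
// Hypothetical shapes: the real SessionAction type is not shown in this issue.
interface ParsedCommand {
  name: string;   // Maestro command name, e.g. "launchApp"
  line: number;   // 1-based line number in the flow file
  value: unknown; // raw YAML value attached to the command
}

interface SessionAction {
  command: string;
  args: string[];
}

// Error carrying the exact command name and line, so the CLI can report both.
class MaestroCompatError extends Error {
  commandName: string;
  line: number;

  constructor(commandName: string, line: number) {
    super(`Unsupported Maestro command "${commandName}" at line ${line}`);
    this.name = "MaestroCompatError";
    this.commandName = commandName;
    this.line = line;
  }
}

type Mapper = (cmd: ParsedCommand) => SessionAction;

// Illustrative subset only. A Map avoids accidental hits on Object.prototype keys.
const SUPPORTED = new Map<string, Mapper>([
  ["launchApp", (cmd) => ({ command: "open", args: [String(cmd.value)] })],
  ["setOrientation", (cmd) => ({ command: "rotate", args: [String(cmd.value)] })],
]);

// Translate every command, or fail on the first one that has no mapping.
function toSessionActions(commands: ParsedCommand[]): SessionAction[] {
  return commands.map((cmd) => {
    const mapper = SUPPORTED.get(cmd.name);
    if (!mapper) {
      throw new MaestroCompatError(cmd.name, cmd.line);
    }
    return mapper(cmd);
  });
}
```

Failing on the first unmapped command keeps the behavior explicit: nothing is silently dropped, and the message names both the command and its line.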

Public app research

Sampled public Maestro suites:

Test suite execution

  • agent-device test --maestro <path-or-glob>... runs Maestro compatibility flows through the replay test suite runner.
  • test --maestro --shard-all <n> runs the full runnable suite on each selected device.
  • test --maestro --shard-split <n> distributes runnable suite entries across selected devices.
  • test --maestro --device <id,id> accepts comma-separated explicit shard devices.
  • Sharded test runs can select connected or booted devices when an explicit device list is omitted.
  • Sharded test runs scope sessions, request IDs, and artifacts per shard.
  • Sharded test results and JUnit output include shard/device metadata for disambiguation.
  • Replay test flows receive native shard/device variables: AD_DEVICE_ID, AD_SHARD_INDEX, and AD_SHARD_COUNT.
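
The difference between --shard-all and --shard-split can be sketched as follows. This is an illustration under assumptions, not the runner's implementation: planShards, parseDeviceList, and ShardPlan are hypothetical names, the round-robin distribution and the zero-based AD_SHARD_INDEX are guesses, and the sketch takes the shard count from the device list rather than from <n>.

```typescript
interface ShardPlan {
  deviceId: string;
  entries: string[];           // suite entries this shard will run
  env: Record<string, string>; // variables exposed to replay test flows
}

type ShardMode = "all" | "split";

// Parse a comma-separated --device value such as "emulator-5554,emulator-5556".
function parseDeviceList(value: string): string[] {
  return value
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}

// Build one plan per device.
// "all": every device runs the full runnable suite.
// "split": entries are distributed round-robin across devices (assumed strategy).
function planShards(entries: string[], deviceIds: string[], mode: ShardMode): ShardPlan[] {
  const count = deviceIds.length;
  return deviceIds.map((deviceId, index) => ({
    deviceId,
    entries: mode === "all" ? [...entries] : entries.filter((_, i) => i % count === index),
    env: {
      AD_DEVICE_ID: deviceId,
      AD_SHARD_INDEX: String(index), // zero-based here; the real base is not stated
      AD_SHARD_COUNT: String(count),
    },
  }));
}
```

For example, planShards(["a.yaml", "b.yaml", "c.yaml"], parseDeviceList("d1,d2"), "split") assigns a.yaml and c.yaml to d1 and b.yaml to d2 under the round-robin assumption.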

Compatibility triage

Easy mappings should translate Maestro syntax to existing Agent Device commands without adding new runtime semantics. Runtime shims are acceptable only when scoped to Maestro compatibility and covered by parser/runtime tests.

Easy next mappings

  • setAirplaneMode true/false -> settings airplane on|off.
  • setLocation -> settings location set <latitude> <longitude>.
  • setOrientation -> rotate <orientation>.
  • setPermissions for supported targets -> settings permission grant|deny|reset <target>.
  • killApp -> close <appId> where Maestro provides or config supplies the app id.
  • pasteText with a string payload -> type <text> if we accept text-entry equivalence instead of clipboard semantics.
  • startRecording / stopRecording -> record start|stop when Maestro options map cleanly.
  • assertTrue only for literal boolean expressions (true/false), if useful; broader JavaScript expressions stay deferred.
  • launchApp.arguments / launchArguments -> simulator launch arguments.
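
A few of the mappings above can be sketched as a pure translation function. This is a hedged illustration: mapEasyCommand is a hypothetical name, the accepted value shapes (boolean, string, numeric latitude/longitude fields) are assumptions, and passing the Maestro orientation value through to rotate unchanged may not match the vocabulary rotate actually accepts.

```typescript
// Translate one Maestro command into Agent Device CLI arguments.
// Returns null when the command or its value shape is not covered by this sketch,
// so the caller can reject it instead of guessing.
function mapEasyCommand(name: string, value: unknown): string[] | null {
  switch (name) {
    case "setAirplaneMode":
      // true/false -> settings airplane on|off
      return typeof value === "boolean" ? ["settings", "airplane", value ? "on" : "off"] : null;

    case "setLocation": {
      // -> settings location set <latitude> <longitude>
      if (typeof value !== "object" || value === null) return null;
      const { latitude, longitude } = value as { latitude?: unknown; longitude?: unknown };
      if (typeof latitude !== "number" || typeof longitude !== "number") return null;
      return ["settings", "location", "set", String(latitude), String(longitude)];
    }

    case "setOrientation":
      // -> rotate <orientation> (value passed through unchanged: an assumption)
      return typeof value === "string" ? ["rotate", value] : null;

    case "killApp":
      // -> close <appId>; resolving the app id from config is left to the caller
      return typeof value === "string" ? ["close", value] : null;

    default:
      return null;
  }
}
```

Because the function adds no runtime semantics, each case stays a one-to-one syntax translation, in line with the triage rule for easy mappings.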

Sampled syntax still deferred as feature work

  • repeat.while: requires replay runtime conditionals.
  • runFlow.when.true and full expression predicates: requires expression evaluation policy.
  • evalScript: requires a broader sandboxing and side-effect model.
  • eraseText: requires a neutral clear-field capability; empty fill is intentionally unsupported.
  • copyTextFrom: requires element text extraction into a variable/output model.
  • toggleAirplaneMode: requires reading current state or accepting nondeterministic toggle semantics; prefer setAirplaneMode first.
  • launchApp.permissions: requires launch-time permission mapping before app open.
  • launchApp.clearKeychain: true, clear-state command, and clear-keychain command: require neutral keychain/reset capabilities.
  • Relationship selectors such as index, childOf, above, below, containsChild, and related variants.

Runtime compatibility shims now supported

  • scrollUntilVisible: polls with wait / fuzzy text lookup and scroll probes.
  • runFlow.when.visible.
  • runFlow.when.notVisible.
  • runScript file/env with a minimal compatibility sandbox for http.post, json, and output variables.
  • tapOn.optional: best-effort skip after the tap retry window.
  • tapOn.point percentage coordinates against the current screen snapshot frame.
  • iOS non-hittable selector tap fallback for Maestro-style hidden/1x1 RN E2E controls, behind an explicit replay-only flag.
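
Resolving tapOn.point against the current screen snapshot frame might look like the sketch below. The Frame and Point shapes and the resolveTapPoint name are hypothetical, and rounding to whole pixels and rejecting mixed absolute/percentage values are assumptions rather than documented behavior.

```typescript
interface Frame {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Accepts "x,y" (absolute) or "x%,y%" (percentage of the frame).
function resolveTapPoint(point: string, frame: Frame): Point {
  const match = /^\s*(\d+(?:\.\d+)?)(%?)\s*,\s*(\d+(?:\.\d+)?)(%?)\s*$/.exec(point);
  if (!match) {
    throw new Error(`Invalid tapOn.point value: "${point}"`);
  }
  const [, rawX, percentX, rawY, percentY] = match;
  if (percentX !== percentY) {
    // Assumption: mixing absolute and percentage coordinates is rejected.
    throw new Error(`Mixed absolute and percentage coordinates: "${point}"`);
  }
  const x = Number(rawX);
  const y = Number(rawY);
  if (percentX === "%") {
    // Scale into the frame, offset by the frame origin.
    return {
      x: Math.round(frame.x + (x / 100) * frame.width),
      y: Math.round(frame.y + (y / 100) * frame.height),
    };
  }
  return { x, y };
}
```

With a frame of { x: 0, y: 0, width: 400, height: 800 }, the value "50%,50%" resolves to the point (200, 400), while "10,20" is returned unchanged.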

Supported flow shape and composition

  • YAML config document + command document separated by ---.
  • Command-only YAML list.
  • appId.
  • env metadata parsing.
  • ${VAR} interpolation for string fields used by supported commands.
  • ${output.KEY} interpolation after supported runScript output assignment.
  • onFlowStart.
  • onFlowComplete.
  • runFlow scalar file form.
  • runFlow.file.
  • runFlow.env.
  • runFlow.commands.
  • runFlow.when.platform.
  • runFlow.when.visible.
  • runFlow.when.notVisible.
  • runFlow.when.true / expression predicates.
  • Full JavaScript expression interpolation.
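
The two interpolation forms, ${VAR} and ${output.KEY}, can be sketched without any expression evaluation. This is an illustration, not the parser's code: interpolate is a hypothetical name, and throwing on undefined variables or on anything that is not a plain identifier is an assumed policy.

```typescript
// Replace ${VAR} from env and ${output.KEY} from runScript outputs.
// Anything else inside ${...} (for example a JavaScript expression) is rejected.
function interpolate(
  template: string,
  env: Record<string, string>,
  output: Record<string, string>,
): string {
  const identifier = /^[A-Za-z_][A-Za-z0-9_]*$/;
  return template.replace(/\$\{([^}]+)\}/g, (_whole: string, rawName: string) => {
    const name = rawName.trim();
    const isOutput = name.startsWith("output.");
    const source = isOutput ? output : env;
    const key = isOutput ? name.slice("output.".length) : name;
    const defined = Object.prototype.hasOwnProperty.call(source, key);
    if (!identifier.test(key) || !defined) {
      // Assumed policy: fail loudly instead of leaving the placeholder in place.
      throw new Error("Unsupported or undefined interpolation: ${" + name + "}");
    }
    return source[key];
  });
}
```

For instance, interpolate("Hello ${NAME}, token ${output.TOKEN}", { NAME: "Ada" }, { TOKEN: "abc" }) returns "Hello Ada, token abc", while a template containing ${1 + 1} throws.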

Supported commands

  • launchApp.
  • launchApp.appId.
  • launchApp.stopApp as open --relaunch.
  • launchApp.clearState: false and launchApp.clearKeychain: false as tolerated no-op compatibility.
  • launchApp.arguments / launchArguments.
  • launchApp.clearState: true for Android/iOS simulator app reset.
  • launchApp.clearKeychain: true.
  • launchApp.permissions.
  • tapOn shorthand text selector.
  • tapOn.id.
  • tapOn.text.
  • tapOn.enabled.
  • tapOn.selected.
  • tapOn.point absolute "x,y".
  • tapOn.point percentage "x%,y%" against the screen frame.
  • tapOn.repeat.
  • tapOn.delay.
  • tapOn.optional.
  • tapOn.retryTapIfNoChange.
  • tapOn.waitToSettleTimeoutMs.
  • point-within-element semantics.
  • doubleTapOn.
  • longPressOn.
  • inputText string value.
  • inputText: { text, label }.
  • random input commands.
  • openLink.
  • assertVisible.
  • assertNotVisible.
  • extendedWaitUntil.visible.
  • extendedWaitUntil.notVisible.
  • extendedWaitUntil.timeout.
  • takeScreenshot.
  • hideKeyboard.
  • pressKey.back.
  • pressKey.enter / pressKey.return.
  • pressKey.home.
  • back.
  • waitForAnimationToEnd.
  • waitForAnimationToEnd.timeout.
  • stopApp.
  • scroll scalar command.
  • scrollUntilVisible.
  • swipe absolute coordinates.
  • swipe percentage coordinates.
  • swipe.direction.
  • repeat.times.
  • repeat.times from ${VAR}.
  • repeat.while.
  • eraseText.
  • pasteText.
  • copyTextFrom.
  • runScript file/env with http.post, json, and output.
  • evalScript.
  • retry.
  • toggleAirplaneMode.
  • setAirplaneMode.
  • setLocation.
  • setOrientation.
  • setPermissions.
  • addMedia.
  • clearState command.
  • clearKeychain command.
  • killApp.
  • travel.
  • assertTrue.
  • assertScreenshot.
  • startRecording.
  • stopRecording.
  • AI commands.

Selector support

  • shorthand text.
  • id.
  • text.
  • enabled.
  • selected.
  • absolute point.
  • percentage point against the screen frame.
  • index.
  • css for web.
  • checked.
  • focused.
  • traits.
  • above.
  • below.
  • leftOf.
  • rightOf.
  • containsChild.
  • childOf.
  • containsDescendants.
  • dimension matchers.
  • Maestro-compatible regex semantics for text / id.

Known semantic gaps

  • extendedWaitUntil.notVisible currently translates to wait <timeout> followed by is hidden, rather than polling until the element is hidden.
  • assertNotVisible is not a polling assertion.
  • runScript is intentionally minimal and does not provide full Maestro JavaScript globals, require, process, arbitrary async APIs, or evalScript.
  • Device-state commands, clear-state command, and keychain reset need neutral Agent Device capabilities before mapping Maestro syntax.
  • Relationship selectors such as index and childOf are rejected instead of silently ignored.

Acceptance criteria

  • agent-device replay --maestro <flow.yaml> reports unsupported Maestro commands with precise command names and line numbers.
  • agent-device test --maestro <path-or-glob>... runs Maestro compatibility flows through the replay test suite runner.
  • agent-device test --maestro supports --shard-all <n> and --shard-split <n> across selected devices.
  • Supported commands have parser unit tests.
  • Runtime shims have replay runtime tests.
  • The local test-app Maestro suite exercises supported compatibility syntax.
  • CLI help describes the supported Maestro subset and links users to this issue for missing commands.
  • The parser does not silently ignore unsupported Maestro fields that affect behavior.

Labels: backlog (Lower priority / backlog), enhancement (New feature or request)
