Hi! As you may have seen, I've been porting agent-device to Dart and trying to maintain some parity with upstream agent-device.
One challenge I'm seeing with both Flutter and React Native apps is flakiness. Do you see it in your tests as well? What I mean is that sometimes even the original agent-device misses taps or fails to scroll.
I've also noticed that the iOS runner does not execute gestures like pan/rotate/fling on the RN test app - is that expected?
I decided to build some benchmarks to see whether this issue comes from vibe-coding agent-device-dart or is inherent to the runner approach. The main difference between agent-device and agent-device-dart is that the latter does not use a daemon.
Things that made the port look good (credit to your design):
- After a clean runner build, snapshot/interaction parity is close. Your daemon keeps per-command device ops fast; the Dart port (no daemon) is a bit faster at cold open and process startup, but each command is slightly slower.
- The fixture + replay format is what made a rigorous A/B benchmark possible at all.
Results:
Full disclosure - my port is fully vibe-coded and I'm going with it mostly as an experiment ;) No intention to build a competitor.