This directory contains examples of multimodal AI inference using ExecuTorch with various backends (XNNPACK, Metal).

| Directory | Description | Model |
|---|---|---|
| ask-anything-app | Web app with camera + chat interface | Gemma3 Vision + Whisper |
| text-runtime | Text generation | Qwen3-0.6B |
| text-image-runtime | Vision-language inference | Gemma3 4B |
| voice-runtime | Speech-to-text | Whisper Tiny |
| object-detection-runtime | Object detection | YOLO26m |