Add Gemma 4 MLX install-path support #19065
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19065

Note: Links to docs will display an error until the docs builds have been completed.

❌ 11 Awaiting Approval, 2 New Failures (as of commit 719d2e8 with merge base d0b7934)

AWAITING APPROVAL - The following workflows need approval before CI can run:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @zeel2104! Thank you for your pull request and welcome to our community.

**Action Required:** In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

**Process:** In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with `CLA signed`.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
This PR needs a `release notes:` label.
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
```diff
-# Check if model uses sliding window attention
-sliding_window = getattr(model.config, "sliding_window", None)
+# Check if model uses sliding window attention. Multimodal configs like
```
Does this regress gemma3?
I don't expect this to regress Gemma 3. The change is just switching the sliding-window lookup to `model.config.get_text_config()`, which also covers the plain text config case and is needed for Gemma 4, where those attrs live under `text_config`. I scoped the logic to the same attribute lookup, not a Gemma-4-specific branch. I can also rerun a Gemma 3 smoke test and report back.
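For reference, a minimal sketch of the lookup under discussion (not the exact diff); the ungated model id is just an example from this thread:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("unsloth/gemma-3-1b-it")
# get_text_config() returns the nested text_config for multimodal configs
# and the config itself for plain text-only models, so one lookup covers both.
text_config = config.get_text_config()
sliding_window = getattr(text_config, "sliding_window", None)
```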
Yeah, it would be great to try gemma3 as a smoke test.
If you are unable to access the version from Google, try the unsloth version `unsloth/gemma-3-1b-it` (https://github.com/pytorch/executorch/blob/main/.github/workflows/mlx.yml#L469C18-L469C39).
```python
logger = logging.getLogger(__name__)


def _iter_mlx_backend_candidates():
```
This code should not be needed. Did you run `python install_executorch.py --editable` on a Mac machine with Xcode installed? If so, in the install logs, did you see a comment about MLX installation being skipped for some reason?
```cpp
}

try {
  std::cerr << "MLX init: constructing handle" << std::endl;
```
Yes, this was debug-only while I was chasing the install/runtime registration issue. I’ll remove the std::cerr logging before merge.
Looks fantastic! A couple questions:
```bash
QEMBEDDING_ARGS="--qembedding ${QCONFIG}"
if [ "${MODEL_ID}" = "google/gemma-4-E2B-it" ]; then
  QEMBEDDING_ARGS=""
```
```diff
 logger.info(f"Loading model from {pte_path}...")
-et_runtime = Runtime.get()
+et_runtime = _ensure_mlx_backend_registered()
```
This shouldn't be needed, see comment on the install process.
That’s fair. I added this while debugging the installed-package path locally because MLXBackend was not being registered from the installed package, and I wanted a way to keep validating the runtime path. Since the install-path issue is now fixed, I’ll remove it and rely on the normal install flow.
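For posterity, a hypothetical sketch of what a workaround like this could look like; the helper name mirrors the diff, but the import path and the registry check are assumptions:

```python
from executorch.runtime import Runtime


def _ensure_mlx_backend_registered() -> Runtime:
    rt = Runtime.get()
    if "MLXBackend" not in rt.backend_registry.registered_backend_names:
        # Importing the backend package is assumed to trigger its
        # registration side effect (import path is an assumption).
        import executorch.backends.mlx  # noqa: F401

        rt = Runtime.get()
    return rt
```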
```diff
 # Decode only the newly generated tokens (not the input prompt)
 new_tokens = generated_tokens[seq_len:]
-generated_text = tokenizer.decode(new_tokens, skip_special_tokens=True)
+generated_text = text_processor.decode(new_tokens, skip_special_tokens=True)
```
Does this break the path where uses_processor=False?
Can we unify these two paths somehow?
I ended up unifying this path. `text_processor` is now either an `AutoProcessor` or an `AutoTokenizer`, and both decode through `text_processor.decode(...)`, so the `uses_processor=False` case should still work. The remaining split is only at encode time, where `AutoProcessor` needs `processor(text=..., return_tensors="pt")` and `AutoTokenizer` still uses `encode(...)`.
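A rough sketch of the unified flow, assuming illustrative names (`uses_processor`, `prompt`) and the model ids mentioned in this thread (the gemma-4 repo may be gated):

```python
from transformers import AutoProcessor, AutoTokenizer

uses_processor = False
prompt = "What is the capital of France?"

if uses_processor:
    # AutoProcessor path: encode via the processor call signature.
    text_processor = AutoProcessor.from_pretrained("google/gemma-4-E2B-it")
    input_ids = text_processor(text=prompt, return_tensors="pt")["input_ids"]
else:
    # AutoTokenizer path: encode via encode().
    text_processor = AutoTokenizer.from_pretrained("unsloth/gemma-3-1b-it")
    input_ids = text_processor.encode(prompt, return_tensors="pt")

# Decode is unified: both AutoProcessor and AutoTokenizer expose decode().
new_tokens = input_ids[0]  # stand-in for tokens generated past the prompt
text = text_processor.decode(new_tokens, skip_special_tokens=True)
```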
I don't expect a Gemma 3 regression from these changes. I did not get a non-custom Gemma 4 path to a validated state here; the issues I hit were around Gemma 4's hybrid/shared-KV cache layout and mixed sliding/full-attention behavior, so I focused on the custom SDPA + custom KV cache path. I also did not land `--qembedding` for Gemma 4, since I could not get it working reliably. That's why docs and CI are limited to that exact configuration in this PR.
@metascroy I ran `python install_executorch.py --editable` as suggested.
I tried to rerun a Gemma 3 smoke test locally, but I'm currently blocked by Hugging Face access on the gated `google/gemma-3-1b-it` repo; the request fails at model download with a gated-repo access error.

So I wasn't able to complete a Gemma 3 end-to-end rerun in this environment. I still don't expect this change to regress Gemma 3, since the relevant change here is switching the sliding-window lookup to `model.config.get_text_config()`.

Let me know if you'd like me to dig further into the non-custom path or embedding quant in a follow-up.
For gemma3 verification, you can use the unsloth version `model_id="unsloth/gemma-3-1b-it"`, which isn't gated. This is what we use in CI: https://github.com/pytorch/executorch/blob/main/.github/workflows/mlx.yml#L469
Can you say a bit more about what you mean by reliably? Did it fail to lower or run? Or did you run into model quality issues with quantized embeddings?

On the non-custom path: I think it is fine to leave as a follow-up, I was just curious about the specific errors you saw.
```cpp
  ET_LOG(Error, "MLX execute failed: %s", e.what());
  return Error::Internal;
} catch (...) {
  ET_LOG(Error, "MLX execute failed: unknown non-std exception");
```
No, I did not specifically hit those C++ catch-all paths.
The failures I was debugging were earlier in the flow:
- Python/export-time Gemma 4 compatibility issues in the HF export path
- installed/editable install issues around getting the MLX path working cleanly
- the `DEBUG=release` editable install failure in `setup.py`
So those catch blocks were not the source of the Gemma 4 bring-up work here.
```cpp
  }
  return Error::InvalidProgram;
} catch (...) {
  ET_LOG(Error, "Failed to load MLX program: unknown non-std exception");
```
```diff
 # is Release.
 def get_build_type(is_debug=None) -> str:
-    debug = int(os.environ.get("DEBUG", 0) or 0) if is_debug is None else is_debug
+    if is_debug is None:
```
Were these changes for debugging only?
No, these were not debug-only.

This came from the editable install path failing in my environment because `DEBUG=release`, while the existing code assumed `DEBUG` was always integer-like. The `get_build_type()` change makes that handling robust for string values like `release` / `debug` / `true` / `false`, which unblocked `python install_executorch.py --editable` for me.

I re-ran the editable install after this change and verified that `MLXBackend` registers correctly there.
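A hedged sketch of what the string-tolerant handling could look like (not the exact `setup.py` diff):

```python
import os


def get_build_type(is_debug=None) -> str:
    if is_debug is None:
        raw = str(os.environ.get("DEBUG", "0")).strip().lower()
        # Tolerate string values such as DEBUG=release / false / 0 as well
        # as DEBUG=debug / true / 1, instead of assuming an integer.
        is_debug = raw in ("1", "true", "debug")
    return "Debug" if is_debug else "Release"
```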
I'd rather not touch setup.py for this task, unless it is actually needed.
If things work with "python install_executorch.py --editable", then let's leave these setup improvements for another PR.
I reran the Gemma 3 smoke test locally using the ungated CI model `unsloth/gemma-3-1b-it`, with the same custom-path flags used for Gemma 4 (`--use-custom-sdpa --use-custom-kv-cache`). The failure happens during export, before lowering/runtime.

So at least in my local environment, this does look like a Gemma 3 regression in the custom KV-cache path rather than just a Gemma 4-only issue.

Happy to investigate that Gemma 3 custom-cache regression further if you want that covered before merge, or I can keep this PR scoped strictly to the Gemma 4 path that was validated.
Let's see what CI says. You can keep the change scoped to gemma 4, but we cannot have gemma 3 regressing because of your change. |
@zeel2104 CI for gemma3 (custom path only) is failing, whereas it was previously passing. I suspect there is a breaking change in HF interfaces, and your changes for the custom path are implicitly depending on this breaking change. Can you make sure your changes work against the pin in https://github.com/pytorch/executorch/blob/main/.ci/docker/ci_commit_pins/optimum-executorch.txt (which is what we run in CI)? See the tests in https://github.com/pytorch/executorch/blob/main/.github/workflows/mlx.yml for setup (and the transformers version we pin against). Let me know if this isn't possible to do for Gemma4.
Makes sense.
Thanks, I tracked this down to an HF cache API compatibility issue. My custom cache replacement had started assuming newer HF cache-layer behavior than the version pinned in CI. I updated it to handle both the older pinned interface and the newer Gemma 4-capable interface. After the fix, Gemma 3 works against the CI pin again and Gemma 4 continues to work with the newer interface.

So this should avoid the Gemma 3 regression while keeping the Gemma 4 support intact.
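To illustrate, a hedged sketch of dual-interface cache access; the attribute names follow how HF `DynamicCache` evolved (parallel `key_cache` lists vs per-layer `layers[i].keys`), but the exact pinned versions involved are assumptions:

```python
def _layer_keys(cache, layer_idx):
    # Newer HF interface: per-layer cache objects with .keys / .values.
    layers = getattr(cache, "layers", None)
    if layers is not None:
        return layers[layer_idx].keys
    # Older pinned interface: parallel lists of key/value tensors.
    return cache.key_cache[layer_idx]
```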
Re-running CI |
Gemma3 is working again, but it looks like gemma4 failed in CI :( |
```bash
if [ "${MODEL_ID}" = "google/gemma-4-E2B-it" ]; then
  # Gemma 4 requires a newer Transformers build than the CI-wide
  # optimum-executorch pin currently brings in.
  ${CONDA_RUN} pip install -U "transformers @ git+https://github.com/huggingface/transformers.git"
```
Can we pin on something specific? Whatever version you pin on, add it to the README under the gemma4 section.
The Gemma 4 failure changed after the last CI fix: export and runtime now work, but output quality regressed on a floating `transformers` install from git main. I pinned Gemma 4 to a specific `transformers` revision instead, and added the same pin to the README. I haven't touched the Qwen35-MoE threshold yet since that still looks separate.
@zeel2104 it looks like the gemma4 test is failing in CI.
I pushed one more Gemma 4 follow-up. CI is getting through export and runtime now, so I updated the Gemma 4 output validation accordingly. I left the Qwen35-MoE threshold unchanged since that still looks separate.
Yeah, you can ignore the Qwen3.5-MOE. It is unrelated. Re-running CI with your latest changes. |
Again the gemma4 CI failed; it was due to moving dependencies in the validation path. First, CI was using a floating model revision from the Hub rather than a fixed one. To make the Gemma 4 path reproducible, I pinned both the `transformers` version and the `google/gemma-4-E2B-it` model revision.

I also wired the model revision through the export/run scripts and added the same pins to the README.
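As a sketch of what the revision pinning could look like on the Python side (the plumbing and the placeholder SHA are assumptions, not the actual pinned value):

```python
from transformers import AutoConfig

PINNED_REVISION = "<pinned-commit-sha>"  # assumption: an exact Hub commit

config = AutoConfig.from_pretrained(
    "google/gemma-4-E2B-it",
    revision=PINNED_REVISION,  # resolve against the pin instead of main
)
```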
```python
else:
    raise NotImplementedError(f"Support for input {arg} is not implemented")


placeholder_nodes = {
```
I don't follow this change. Why is gemma4 sensitive to this?
I got here by diffing a previously working Gemma 4 `.pte` against a fresh export. What changed there was the slot assignment for the two rotary constants used by sliding-window vs full attention. This change was just to make that assignment deterministic instead of depending on raw placeholder traversal order. Gemma 4 is where I noticed it because that model exercises both constants in the same path; a minimal sketch of the idea is below.
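A minimal sketch of the determinism idea, assuming a `torch.fx.GraphModule` from the export (illustrative helper, not the exact diff):

```python
import torch


def collect_placeholders(graph_module: torch.fx.GraphModule) -> dict:
    # Key placeholders by node name, sorted, so slot assignment does not
    # depend on raw graph traversal order.
    return {
        node.name: node
        for node in sorted(
            (n for n in graph_module.graph.nodes if n.op == "placeholder"),
            key=lambda n: n.name,
        )
    }
```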
If you'd prefer, I can drop this change from the PR.
@metascroy At this point it looks more like a Gemma 4 `--qlinear 4w` quantization quality issue than an export/runtime failure.
It would be good to get 4w working. Let me try checking out your PR today to see if I notice anything. |
## Summary
Enable Gemma 4 on the MLX backend through the HuggingFace export/run path.
This PR:

- adds Gemma 4 support to `backends/mlx/examples/llm/export_llm_hf.py`
- adds Gemma 4 support to `backends/mlx/examples/llm/run_llm_hf.py`
- makes the export/run flow work from the installed package without `PYTHONPATH` workarounds

This PR does not add Gemma 4 support to the internal `export_llm` path (`examples/models/gemma4/`).

## Test plan
Manual validation on Apple Silicon macOS using the installed package from `.venv/site-packages`:

```bash
python -m executorch.backends.mlx.examples.llm.export_llm_hf \
  --model-id google/gemma-4-E2B-it \
  --output /tmp/gemma4_custom_qlinear_only_installed.pte \
  --qlinear 4w \
  --use-custom-sdpa \
  --use-custom-kv-cache

python -m executorch.backends.mlx.examples.llm.run_llm_hf \
  --pte /tmp/gemma4_custom_qlinear_only_installed.pte \
  --model-id google/gemma-4-E2B-it \
  --prompt "What is the capital of France?" \
  --max-new-tokens 50
```

### Validation
- installed import path resolves from `.venv/lib/python3.12/site-packages/executorch/...`
- `MLXBackend` is registered from the installed package (see the check sketched after this list)
- export succeeds for `google/gemma-4-E2B-it` with `--qlinear 4w --use-custom-sdpa --use-custom-kv-cache`
- runtime succeeds without `PYTHONPATH`
- generated output contains `Paris`
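A quick registration check along these lines can confirm the installed-package path; `backend_registry.registered_backend_names` is assumed to be the Python runtime's registry listing:

```python
from executorch.runtime import Runtime

# Run from outside the repo checkout so the installed package resolves.
rt = Runtime.get()
assert "MLXBackend" in rt.backend_registry.registered_backend_names
```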