fix: initialize `embeddings_pre_norm_masked=false` in `llama_context` by abetlen · Pull Request #23256 · ggml-org/llama.cpp

abetlen · 2026-05-18T08:40:27Z

Overview

This PR fixes a bug introduced in #23198, which added the new `embeddings_pre_norm_masked` struct member to `llama_context` without initializing it. When left uninitialized, the flag broke the construction of Qwen3.5 graphs: `get_rows_f32` failed an assert because it tried to read an invalid row index.
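As a rough illustration of this class of bug, here is a minimal sketch of a flag with a default member initializer that guards a row-gather step. Everything except the flag name is an assumption made for illustration: `sketch_context`, `select_rows`, and `get_rows` are hypothetical stand-ins, not the real `llama_context` or `get_rows_f32`, and the real graph code may use the flag differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a context struct; NOT the real llama_context.
struct sketch_context {
    // Default member initializer: every constructor starts from a known value.
    // Without "= false", a default-initialized object would hold an
    // indeterminate value here, and reading it is undefined behavior.
    bool embeddings_pre_norm_masked = false;
};

// Hypothetical graph-building step: pick the row indices to gather.
// If the mask flag is set, use the caller-provided masked rows;
// otherwise select every row in order.
std::vector<int32_t> select_rows(const sketch_context & ctx,
                                 int32_t n_rows,
                                 const std::vector<int32_t> & masked_rows) {
    assert(n_rows >= 0);
    if (ctx.embeddings_pre_norm_masked) {
        return masked_rows;
    }
    std::vector<int32_t> all(static_cast<std::size_t>(n_rows));
    for (int32_t i = 0; i < n_rows; ++i) {
        all[static_cast<std::size_t>(i)] = i;
    }
    return all;
}

// Simplified row gather over a row-major matrix. The bounds check plays the
// role of the assert mentioned above: an out-of-range index is rejected.
std::vector<float> get_rows(const std::vector<float> & src,
                            int32_t n_rows,
                            int32_t n_cols,
                            const std::vector<int32_t> & idx) {
    std::vector<float> dst;
    dst.reserve(idx.size() * static_cast<std::size_t>(n_cols));
    for (int32_t r : idx) {
        if (r < 0 || r >= n_rows) {
            throw std::out_of_range("invalid row index");
        }
        const auto first = src.begin() + static_cast<std::ptrdiff_t>(r) * n_cols;
        dst.insert(dst.end(), first, first + n_cols);
    }
    return dst;
}
```

With the initializer in place, a freshly constructed `sketch_context` always takes the unmasked path until the flag is set explicitly, so the gather step never sees indices chosen on the basis of an indeterminate value.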

Additional information

Failing CI run with the relevant assert

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: yes, gpt 5.5 xhigh was used through the codex cli to find the root cause of this bug when the CI job failed.

ggerganov · 2026-05-18T11:21:04Z

Thanks @abetlen!

* master: (100 commits)
  Agent update
  hexagon: add support for TRI op (ggml-org#22822)
  ggml-hexagon: add PAD op HVX kernel (ggml-org#23078)
  docker : add OCI image labels for version and build date (ggml-org#21653)
  common : remove hf cache migration (ggml-org#23266)
  ui: Update KaTeX package and clean up logs from `sass` warnings (ggml-org#23275)
  feat: add scroll-to-bottom button to chat + prevent forced scroll down (ggml-org#23270)
  ui: Refactor models store, MCP service, and gate logs behind VITE_DEBUG (ggml-org#23236)
  ui: Centralize monospace font styles in app.css (ggml-org#23272)
  webui: fix Tailwind v4 utility classes missing when built via cmake (ggml-org#23253)
  llama: initialize pre-norm embedding mask flag (ggml-org#23256)
  add myself to conversion (ggml-org#23261)
  ci : added kleidiai-server to server-self-hosted workflow (ggml-org#22435)
  scripts : allow wc2wt with an existing branch (ggml-org#23189)
  sycl: scalar SWAR byte-subtract in Q6_K MMVQ dot product (ggml-org#22156)
  sycl: route small f32 matmuls to oneMKL, bypass oneDNN (ggml-org#22150)
  sycl : fix error when use -mg 1 error (ggml-org#23140)
  update bid to match each layers MTP source (ggml-org#23237)
  cmake : do not check for bin install dir (ggml-org#23234)
  feat: Support d_conv=15 for ssm-conv.cu (ggml-org#23017)
  ...

- Known issue: v326 vulkan cpy bf16->f32 SIGSEGV on GFX1103 PHOENIX (remediation pending)
- v326: vulkan BF16 copy pipelines (mainline PR ggml-org#22677 cherry-pick)
- v325: pre-norm embedding mask init fix (mainline PR ggml-org#23256 cherry-pick)
- v324: Vulkan BF16 FA dispatch via inline uvec2 dequant (COOPMAT1 path)
- v323: KV cache CPU fallback for types lacking GPU SET_ROWS support

llama: initialize pre-norm embedding mask flag

e022161

abetlen requested a review from ggerganov as a code owner May 18, 2026 08:40

am17an approved these changes May 18, 2026

ggerganov approved these changes May 18, 2026

ggerganov merged commit 49c21f9 into ggml-org:master May 18, 2026
45 of 49 checks passed

kgrama pushed a commit to kgrama/llama.cpp that referenced this pull request May 19, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

8151b30

xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 19, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

15743bf

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

9ac1386

ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request May 19, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

9570433

fhnmor21 pushed a commit to fhnmor21/llama-cpp-turboquant that referenced this pull request May 19, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

a09ef01

dbrain pushed a commit to dbrain/hbd-llama-cpp-turboquant that referenced this pull request May 21, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

42a0b6a

baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

15b14f0

srossitto79 pushed a commit to srossitto79/llama.cpp that referenced this pull request May 23, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

63857e0

a-ghorbani mentioned this pull request May 24, 2026

chore(deps): upgrade llama.rn to 0.12.3 a-ghorbani/pocketpal-ai#740

Merged

7 tasks

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

llama: initialize pre-norm embedding mask flag (ggml-org#23256)

785390e

ggerganov merged 1 commit into ggml-org:master from abetlen:fix/qwen35-pre-norm-mask-init

3 participants
