fix: initialize embeddings_pre_norm_masked=false in llama_context #23256
Merged
am17an
approved these changes
May 18, 2026
ggerganov
approved these changes
May 18, 2026
Member
Thanks @abetlen!
Jcfunk
added a commit
to Jcfunk/llama.cpp
that referenced
this pull request
May 19, 2026
* master: (100 commits) Agent update hexagon: add support for TRI op (ggml-org#22822) ggml-hexagon: add PAD op HVX kernel (ggml-org#23078) docker : add OCI image labels for version and build date (ggml-org#21653) common : remove hf cache migration (ggml-org#23266) ui: Update KaTeX package and clean up logs from `sass` warnings (ggml-org#23275) feat: add scroll-to-bottom button to chat + prevent forced scroll down (ggml-org#23270) ui: Refactor models store, MCP service, and gate logs behind VITE_DEBUG (ggml-org#23236) ui: Centralize monospace font styles in app.css (ggml-org#23272) webui: fix Tailwind v4 utility classes missing when built via cmake (ggml-org#23253) llama: initialize pre-norm embedding mask flag (ggml-org#23256) add myself to conversion (ggml-org#23261) ci : added kleidiai-server to server-self-hosted workflow (ggml-org#22435) scripts : allow wc2wt with an existing branch (ggml-org#23189) sycl: scalar SWAR byte-subtract in Q6_K MMVQ dot product (ggml-org#22156) sycl: route small f32 matmuls to oneMKL, bypass oneDNN (ggml-org#22150) sycl : fix error when use -mg 1 error (ggml-org#23140) update bid to match each layers MTP source (ggml-org#23237) cmake : do not check for bin install dir (ggml-org#23234) feat: Support d_conv=15 for ssm-conv.cu (ggml-org#23017) ...
kgrama
pushed a commit
to kgrama/llama.cpp
that referenced
this pull request
May 19, 2026
xxmustafacooTR
pushed a commit
to xxPlayground/llama-cpp-turboquant
that referenced
this pull request
May 19, 2026
rsenthilkumar6
pushed a commit
to rsenthilkumar6/llama.cpp
that referenced
this pull request
May 19, 2026
ArberSephirotheca
pushed a commit
to ArberSephirotheca/llama.cpp
that referenced
this pull request
May 19, 2026
fhnmor21
pushed a commit
to fhnmor21/llama-cpp-turboquant
that referenced
this pull request
May 19, 2026
jimbothigpen
added a commit
to jimbothigpen/llama.cpp
that referenced
this pull request
May 21, 2026
- Known issue: v326 vulkan cpy bf16->f32 SIGSEGV on GFX1103 PHOENIX (remediation pending) - v326: vulkan BF16 copy pipelines (mainline PR ggml-org#22677 cherry-pick) - v325: pre-norm embedding mask init fix (mainline PR ggml-org#23256 cherry-pick) - v324: Vulkan BF16 FA dispatch via inline uvec2 dequant (COOPMAT1 path) - v323: KV cache CPU fallback for types lacking GPU SET_ROWS support
dbrain
pushed a commit
to dbrain/hbd-llama-cpp-turboquant
that referenced
this pull request
May 21, 2026
baramofme
pushed a commit
to baramofme/llama-cpp-turboquant
that referenced
this pull request
May 23, 2026
srossitto79
pushed a commit
to srossitto79/llama.cpp
that referenced
this pull request
May 23, 2026
jimbothigpen
added a commit
to jimbothigpen/llama.cpp
that referenced
this pull request
May 25, 2026
- Known issue: v326 vulkan cpy bf16->f32 SIGSEGV on GFX1103 PHOENIX (remediation pending) - v326: vulkan BF16 copy pipelines (mainline PR ggml-org#22677 cherry-pick) - v325: pre-norm embedding mask init fix (mainline PR ggml-org#23256 cherry-pick) - v324: Vulkan BF16 FA dispatch via inline uvec2 dequant (COOPMAT1 path) - v323: KV cache CPU fallback for types lacking GPU SET_ROWS support
jimbothigpen
added a commit
to jimbothigpen/llama.cpp
that referenced
this pull request
May 25, 2026
- Known issue: v326 vulkan cpy bf16->f32 SIGSEGV on GFX1103 PHOENIX (remediation pending) - v326: vulkan BF16 copy pipelines (mainline PR ggml-org#22677 cherry-pick) - v325: pre-norm embedding mask init fix (mainline PR ggml-org#23256 cherry-pick) - v324: Vulkan BF16 FA dispatch via inline uvec2 dequant (COOPMAT1 path) - v323: KV cache CPU fallback for types lacking GPU SET_ROWS support
fewtarius
pushed a commit
to fewtarius/llama.cpp
that referenced
this pull request
May 30, 2026
Overview
This PR fixes a bug introduced in #23198 by the new `embeddings_pre_norm_masked` struct member for `llama_context`. When left uninitialised, `embeddings_pre_norm_masked` broke the construction of Qwen3.5 graphs: `get_rows_f32` failed an assert because it tried to read an invalid row index.
Additional information
Failing CI run with the relevant assert
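As a rough illustration of the class of bug described in the overview, here is a minimal sketch assuming a simplified stand-in type. `context_sketch` and `gather_row` are hypothetical names, not the real `llama_context` or `get_rows_f32`, and the source does not say how the flag feeds into the row index. The sketch only shows the two pieces the description mentions: a `bool` member given a default value so it can never be read uninitialised, and a row lookup that rejects an out-of-range index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the context type; NOT the real llama_context.
struct context_sketch {
    // Default member initializer: every construction path starts with false,
    // so the flag is never read while holding an indeterminate value.
    bool embeddings_pre_norm_masked = false;
};

// Hypothetical row gather with a bounds check, loosely analogous to an
// assert on the row index. It returns false rather than aborting so the
// behaviour can be tested.
bool gather_row(const std::vector<float> & src, std::size_t n_cols,
                long row, std::vector<float> & dst) {
    if (n_cols == 0) {
        return false;
    }
    const std::size_t n_rows = src.size() / n_cols;
    if (row < 0 || static_cast<std::size_t>(row) >= n_rows) {
        return false; // invalid row index
    }
    const auto first = src.begin() +
        static_cast<std::ptrdiff_t>(static_cast<std::size_t>(row) * n_cols);
    dst.assign(first, first + static_cast<std::ptrdiff_t>(n_cols));
    return true;
}
```

With the default initializer in place, a freshly constructed `context_sketch` always reports `embeddings_pre_norm_masked == false`, whichever constructor or aggregate form is used.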
Requirements