[ML] Add EuroBERT/Jina v5 ops to graph validation allowlist #3015

Draft
edsavage wants to merge 3 commits into elastic:main from edsavage:feature/jina-v5-ops

@edsavage
Contributor

Summary

Adds 4 ops required by the Jina Embeddings v5 model architecture (EuroBERT + LoRA adapters):

| Op | Used for |
|---|---|
| aten::sin | Rotary position embeddings (RoPE), sine component |
| aten::cos | Rotary position embeddings (RoPE), cosine component |
| aten::rsqrt | RMSNorm (EuroBERT uses RMSNorm instead of LayerNorm) |
| aten::silu | SiLU/Swish activation (EuroBERT uses SiLU instead of GELU) |
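
For readers unfamiliar with these ops, here is a minimal pure-Python sketch of the arithmetic each one contributes. It is not taken from EuroBERT or Jina v5 code. The function names, the `eps` default, and the single-pair RoPE layout are illustrative assumptions, and real models apply these operations to tensors.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: scale x by the reciprocal square root of its mean square."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)  # the step aten::rsqrt performs
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(x):
    """SiLU/Swish: x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def rope_rotate(x0, x1, position, freq):
    """Rotate one (x0, x1) feature pair by the angle position * freq."""
    angle = position * freq
    c, s = math.cos(angle), math.sin(angle)  # aten::cos / aten::sin
    return (x0 * c - x1 * s, x0 * s + x1 * c)
```

Unlike LayerNorm, RMSNorm subtracts no mean, so the reciprocal square root is its only normalisation step. The RoPE rotation changes the direction of a feature pair but not its length.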

Required for elastic/eland#818, which adds support for importing Jina v5 models into Elasticsearch.

The ops were verified locally by tracing jinaai/jina-embeddings-v5-text-nano with the LoRA retrieval adapter merged.

Cross-repo dependencies

Test plan

  • CI passes (allowlist drift test)
  • Add Jina v5 to validation_models.json and reference_model_ops.json
  • Verify traced model passes graph validation locally
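
The core of an allowlist drift check can be sketched as a set difference between the ops each reference model uses and the allowlist. This is an illustration only. The function name `find_unlisted_ops`, the model name, and the model-to-ops mapping are assumptions, and the real layout of reference_model_ops.json may differ.

```python
def find_unlisted_ops(reference_ops, allowlist):
    """Return {model: sorted ops missing from the allowlist} for models with gaps."""
    drift = {}
    for model, ops in reference_ops.items():
        missing = sorted(set(ops) - set(allowlist))
        if missing:
            drift[model] = missing
    return drift


# Hypothetical data: a tiny stand-in allowlist and one Jina-v5-like model.
allowlist_before = {"aten::linear", "aten::mul"}
new_ops = {"aten::sin", "aten::cos", "aten::rsqrt", "aten::silu"}
reference_ops = {
    "example/jina-v5-like": ["aten::linear", "aten::mul", *sorted(new_ops)],
}

drift_before = find_unlisted_ops(reference_ops, allowlist_before)
drift_after = find_unlisted_ops(reference_ops, allowlist_before | new_ops)
```

In this toy data, `drift_before` reports the four new ops for the example model, and `drift_after` is empty once they are added to the allowlist.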

Made with Cursor

Jina Embeddings v5 is based on EuroBERT, which uses a different
architecture from the BERT family:
- RoPE (rotary position embeddings) → aten::sin, aten::cos
- RMSNorm (instead of LayerNorm) → aten::rsqrt
- SiLU activation (instead of GELU) → aten::silu

Required for Eland PR elastic/eland#818 which adds support for
importing Jina v5 models into Elasticsearch.

Made-with: Cursor
@prodsecmachine

prodsecmachine commented Mar 29, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| | Licenses | 0 | 0 | 0 | 0 | 0 issues |


aten::sin and aten::cos are now in the allowlist (needed by
EuroBERT/Jina v5 for rotary position embeddings), so tests that
used them as example "unrecognised" ops now fail.

- Replace torch.sin with torch.logit in synthetic test modules
- Update malicious model tests to check for ops that remain
  unrecognised (aten::tan, aten::exp) rather than sin/cos

Made-with: Cursor
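
The failure described in this commit comes from negative tests that relied on ops which have since been allowed. A small guard along the following lines could catch that earlier. It is a sketch, not code from this PR. `stale_sentinels` is a hypothetical helper, and the allowlist here is a stand-in containing only the four ops this PR adds.

```python
def stale_sentinels(sentinel_ops, allowlist):
    """Return sentinel ops that the allowlist now accepts (ideally empty)."""
    return sorted(set(sentinel_ops) & set(allowlist))


# Stand-in allowlist: only the ops added by this PR (assumption for the example).
allowlist = {"aten::sin", "aten::cos", "aten::rsqrt", "aten::silu"}

old_sentinels = ["aten::sin", "aten::cos"]
new_sentinels = ["aten::logit", "aten::tan", "aten::exp"]

stale_old = stale_sentinels(old_sentinels, allowlist)
stale_new = stale_sentinels(new_sentinels, allowlist)
```

With these inputs, `stale_old` contains both former sentinel ops and `stale_new` is empty. That matches the switch to aten::logit, aten::tan, and aten::exp in the tests.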
…logit)

Regenerate malicious_hidden_in_submodule.pt with aten::logit+clamp so
graph validation still fails when aten::sin is allowed for EuroBERT/Jina.
Update dev-tools/generate_malicious_models.py and test comments.

Made-with: Cursor