[ML] Add EuroBERT/Jina v5 ops and fix graph validator tests #3016

Closed
edsavage wants to merge 2 commits into elastic:main from edsavage:fix/jina-v5-ops-and-test-fixes

edsavage commented Apr 1, 2026

Closes #2890

Summary

Adds 4 ops required by the Jina Embeddings v5 model architecture (EuroBERT + LoRA adapters):

| Op | Used for |
|---|---|
| aten::sin | Rotary position embeddings (RoPE), sine component |
| aten::cos | Rotary position embeddings (RoPE), cosine component |
| aten::rsqrt | RMSNorm (EuroBERT uses RMSNorm instead of LayerNorm) |
| aten::silu | SiLU/Swish activation (EuroBERT uses SiLU instead of GELU) |
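
For reviewers less familiar with these layers, here is a minimal pure-Python sketch of where each op shows up. It is not the EuroBERT implementation: the function names, the `eps` default, and the adjacent-pair RoPE layout are assumptions made for illustration, and real models apply these element-wise over tensors.

```python
import math


def silu(x: float) -> float:
    """SiLU/Swish: x * sigmoid(x), the element-wise result of aten::silu."""
    return x / (1.0 + math.exp(-x))


def rms_norm(xs, weight, eps=1e-6):
    """RMSNorm: scale by the reciprocal root of the mean square (no mean subtraction)."""
    mean_sq = sum(x * x for x in xs) / len(xs)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)  # this step traces to aten::rsqrt
    return [w * x * inv_rms for x, w in zip(xs, weight)]


def rope_rotate(pair, position, theta):
    """Rotate one feature pair by the angle position * theta (RoPE)."""
    x0, x1 = pair
    angle = position * theta
    c, s = math.cos(angle), math.sin(angle)  # aten::cos and aten::sin
    return (x0 * c - x1 * s, x0 * s + x1 * c)
```

LayerNorm and GELU need none of these four ops in this form, which is why a BERT-family allowlist would not already cover them.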

Also fixes graph validator tests that used aten::sin as the canonical "unrecognised op" example. Since sin is now in the allowlist, these tests were failing:

  • Synthetic tests: replaced torch.sin with torch.logit
  • Malicious model tests: check for aten::tan/aten::exp (still unrecognised) instead of sin/cos
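
Why the tests broke can be sketched as below. This is a hypothetical stand-in for the validator, not its actual code: `ALLOWED_OPS` is a tiny illustrative subset and `unrecognised_ops` is a made-up helper name.

```python
# Illustrative subset only; the real allowlist is assumed to be much larger.
ALLOWED_OPS = frozenset({
    "aten::add", "aten::mul", "aten::matmul",
    "aten::sin", "aten::cos", "aten::rsqrt", "aten::silu",  # added in this PR
})


def unrecognised_ops(graph_ops, allowed=ALLOWED_OPS):
    """Return the sorted op names from a traced graph that are not allowlisted."""
    return sorted(set(graph_ops) - allowed)


# A module built on torch.sin no longer trips the check...
print(unrecognised_ops(["aten::mul", "aten::sin"]))    # []
# ...while one built on torch.logit still does.
print(unrecognised_ops(["aten::mul", "aten::logit"]))  # ['aten::logit']
```

Any test that asserts a rejection has to use an op that stays outside the allowlist, hence the switch to aten::logit, aten::tan and aten::exp.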

Ops verified by tracing jinaai/jina-embeddings-v5-text-nano with merged LoRA retrieval adapter locally.

Cross-repo dependencies

  • Eland: elastic/eland#818 (Jina v5 import support)
  • Elasticsearch Java: byte_level_bpe tokenization type + BPE merges vocabulary format support (TBD)

Test plan

  • CI passes (all graph validator tests + allowlist drift test)
  • Verify traced Jina v5 model passes graph validation locally

Made with Cursor

edsavage added 2 commits April 2, 2026 10:41
Jina Embeddings v5 is based on EuroBERT, which uses a different
architecture from the BERT family:
- RoPE (rotary position embeddings) → aten::sin, aten::cos
- RMSNorm (instead of LayerNorm) → aten::rsqrt
- SiLU activation (instead of GELU) → aten::silu

Required for Eland PR elastic/eland#818 which adds support for
importing Jina v5 models into Elasticsearch.

Made-with: Cursor

aten::sin and aten::cos are now in the allowlist (needed by
EuroBERT/Jina v5 for rotary position embeddings), so tests that
used them as example "unrecognised" ops now fail.

- Replace torch.sin with torch.logit in synthetic test modules
- Update malicious model tests to check for ops that remain
  unrecognised (aten::tan, aten::exp) rather than sin/cos

Made-with: Cursor

prodsecmachine commented Apr 1, 2026

Snyk checks have passed. No issues have been found so far.

| Scan Engine | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|
| Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| Licenses | 0 | 0 | 0 | 0 | 0 issues |

edsavage closed this Apr 1, 2026

Development

Successfully merging this pull request may close these issues.

Test CIoManagerTest/testFileIoGood is flaky on linux-x86_64.
