
feat(dpmodel): backend-agnostic DPA4 descriptor and fitting (core path) #5515

Merged
wanghan-iapcm merged 31 commits into deepmodeling:master from wanghan-iapcm:feat-dpmodel-dpa4
Jun 12, 2026

Conversation

@wanghan-iapcm

@wanghan-iapcm wanghan-iapcm commented Jun 11, 2026

Collaborator

What this PR does

Ports the core DPA4/SeZM descriptor and its GLU energy fitting from the PyTorch-only implementation (deepmd/pt, #5448/#5503) to the backend-agnostic dpmodel layer (numpy / array-API). This is PR 1 of a 3-PR series (next: pt_expt wrappers + training; then .pt2 freeze + Python/C++ inference).

Contents

  • deepmd/dpmodel/utils/lebedev.py + relocated lebedev_rules.npz (single source of truth; pt re-wraps it), deepmd/dpmodel/utils/spherical_harmonics.py — general-lmax real spherical harmonics in pure numpy, matching e3nn's normalize=True, normalization="norm" convention exactly (validated vs e3nn to ~1e-15), so dpmodel builds the S2-grid constants without e3nn/torch.
  • deepmd/dpmodel/descriptor/dpa4_nn/ — array-API ports of the sezm_nn math modules (indexing, utils, radial, norm, wignerd, so3, activation, projection, grid_net, so2, attention, embedding, edge_cache, ffn, block), structurally mirroring the pt code 1:1.
  • deepmd/dpmodel/descriptor/dpa4.py: DescrptDPA4, registered as dpa4/DPA4/sezm/SeZM, serialization-compatible with pt DescrptSeZM (cross-deserialization tested in both directions).
  • deepmd/dpmodel/fitting/dpa4_ener.py: SeZMEnergyFittingNet (GLU fitting), registered as dpa4_ener/sezm_ener.
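
To illustrate what a Lebedev rule provides, here is a minimal sketch that writes out the smallest rule by hand: the six octahedron vertices with equal weights, which integrates polynomials up to degree 3 exactly. It does not read the packaged lebedev_rules.npz or call the loader in lebedev.py, whose interfaces are not shown in this PR text; `sphere_average` is a hypothetical helper.

```python
import numpy as np

# Smallest Lebedev rule: the six octahedron vertices with equal weights.
# Weights are normalized to sum to 1, so the rule returns sphere averages.
POINTS = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]],
    dtype=np.float64,
)
WEIGHTS = np.full(6, 1.0 / 6.0)


def sphere_average(func):
    """Approximate the average of func over the unit sphere (hypothetical helper)."""
    return float(np.sum(WEIGHTS * func(POINTS)))


# Degree 2: the exact average of x^2 over the sphere is 1/3; the rule matches it.
avg_x2 = sphere_average(lambda p: p[:, 0] ** 2)

# Degree 4 exceeds what this rule integrates exactly: the true average of x^4 is 1/5.
avg_x4 = sphere_average(lambda p: p[:, 0] ** 4)
```

Higher-precision rules add more points so that higher polynomial degrees, and therefore higher-l spherical harmonics, are integrated exactly.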

Design: padded edges instead of sparse nonzero edges

pt builds a sparse edge list via torch.nonzero and aggregates with index_add_/scatter. The dpmodel port uses a shape-static padded edge layout (E = nf*nloc*nnei, node-contiguous, validity mask): per-edge math is identical, node aggregations and the segment softmax become masked reductions over the nnei axis. This removes data-dependent shapes (torch.export-friendly for the upcoming pt_expt PR) and avoids non-standard scatter ops in array-API code. Masked slots carry a safe dummy direction before quaternion/Wigner evaluation so future autograd backward passes cannot be NaN-poisoned by dead slots.
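
The equivalence between the two layouts can be sketched in plain numpy. This is a toy illustration under stated assumptions, not the code in dpa4_nn: it uses a single frame, scalar values per edge, and hypothetical function names, and it omits the envelope gating of the real attention.

```python
import numpy as np

rng = np.random.default_rng(0)
nloc, nnei = 4, 3

# Padded layout: every node owns nnei slots; mask marks the real edges.
logits = rng.normal(size=(nloc, nnei))
values = rng.normal(size=(nloc, nnei))
mask = np.array(
    [[True, True, False], [True, False, False], [True, True, True], [False, False, False]]
)


def masked_softmax_sum(logits, values, mask):
    """Softmax over the nnei axis restricted to valid slots, then weighted sum."""
    neg = np.where(mask, logits, -np.inf)
    # Subtract the per-node max for stability; fully masked rows fall back to 0.
    row_max = np.max(neg, axis=-1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    # Replace masked slots before exp so garbage there never enters the math.
    shifted = np.where(mask, logits - row_max, 0.0)
    expv = np.where(mask, np.exp(shifted), 0.0)
    denom = np.sum(expv, axis=-1, keepdims=True)
    weights = expv / np.where(denom > 0.0, denom, 1.0)
    return np.sum(weights * np.where(mask, values, 0.0), axis=-1)


def sparse_softmax_sum(logits, values, mask):
    """Reference: explicit edge list plus scatter-style accumulation."""
    dst, slot = np.nonzero(mask)  # data-dependent number of edges
    e_logit, e_val = logits[dst, slot], values[dst, slot]
    node_max = np.full(nloc, -np.inf)
    np.maximum.at(node_max, dst, e_logit)
    e_exp = np.exp(e_logit - node_max[dst])
    denom = np.zeros(nloc)
    np.add.at(denom, dst, e_exp)
    out = np.zeros(nloc)
    np.add.at(out, dst, e_exp / denom[dst] * e_val)
    return out


padded = masked_softmax_sum(logits, values, mask)
sparse = sparse_softmax_sum(logits, values, mask)
```

Both paths give the same per-node result, but only the padded one has shapes that are independent of the data, which is the property a tracing exporter needs.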

Validation

  • Per-module weight-copy parity tests vs the pt classes (fp64, rtol 1e-12 / atol 1e-14) in source/tests/pt/model/test_dpa4_dpmodel_parity.py (377 tests), including dual-cache tests proving pt-sparse vs dp-padded equivalence with adversarial garbage in masked slots.
  • Descriptor-level end-to-end parity vs pt: max abs error ~6e-15; descriptor+fitting chained parity at rtol 1e-10.
  • Torch-free tests (source/tests/common/dpmodel/): serialize round-trips, permutation equivariance, rotation invariance, masked-edge inertness, guard coverage.
  • Cross-backend harness tests: source/tests/consistent/descriptor/test_dpa4.py, source/tests/consistent/fitting/test_dpa4_ener.py (dpa3-precedent tolerances; f64 atol stricter than precedent).
  • A model-dict contract test pins the pt dpa4 model serialize() top-level shape for the follow-up PRs.
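
The symmetry checks listed above follow a common pattern, sketched here on a toy descriptor rather than on DescrptDPA4 itself. `toy_descriptor` and `random_rotation` are hypothetical helpers; only the tolerances are borrowed from the parity tests.

```python
import numpy as np


def toy_descriptor(coord):
    """Per-atom scalar: sum_j exp(-|r_i - r_j|^2), a stand-in for a real descriptor."""
    diff = coord[:, None, :] - coord[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)
    return np.sum(np.exp(-dist2), axis=1) - 1.0  # drop the self term exp(0)


def random_rotation(rng):
    """Random proper rotation from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix the sign ambiguity of QR
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to get det = +1
    return q


rng = np.random.default_rng(42)
coord = rng.normal(size=(5, 3))
ref = toy_descriptor(coord)

# Rotation invariance: rotating all coordinates leaves the descriptor unchanged.
rot = random_rotation(rng)
np.testing.assert_allclose(toy_descriptor(coord @ rot.T), ref, rtol=1e-12, atol=1e-14)

# Permutation equivariance: permuting atoms permutes the descriptor rows.
perm = rng.permutation(5)
np.testing.assert_allclose(toy_descriptor(coord[perm]), ref[perm], rtol=1e-12, atol=1e-14)
```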

Known limitations

  • Core path only. Out-of-scope features raise NotImplementedError naming the config key at construction/deserialize: spin/charge embedding (add_chg_spin_ebd), DeNS dual fitting, ZBL/inner-clamp bridging, multitask FiLM/case embedding (dim_case_embd>0), LoRA, SO3-grid paths (ffn_so3_grid, node_wise_*, message_node_*), grid_mlp, attention-residual modes, layer_scale, attention projections.
  • Lebedev quadrature only: lebedev_quadrature=False (e3nn product grid) is rejected; the default config uses Lebedev.
  • Wigner-D evaluates the closed form that pt's lstsq-fitted monomial kernels approximate; parity ≤1e-14 for the core lmax≤4 range, diverging O(1e-13) only at lmax≥9 (pt's own kernels drift from the closed form there).
  • Padded edges spend compute on masked slots (like dpa1/2/3); pt's sparse path remains faster for very sparse neighbor lists.
  • random_gamma gauge augmentation is accepted and serialized but applied only when explicitly driven. It draws from the numpy RNG, so the stream is not identical to torch's draw; gauge invariance is property-tested.
  • No model-level dpmodel assembly yet (energy model wiring, pt_expt wrappers, training, freeze, inference are PR-2/PR-3).
  • dpmodel numpy execution is a correctness reference, not a performance target; fp16/bf16 compute-dtype promotion points are not bit-mirrored (fp64/fp32 are).
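
The fail-fast behaviour for out-of-scope features could look roughly like the following sketch. The config keys come from the list above; the table, the default values, and `check_unsupported` are assumptions made for illustration and are not the actual guard code.

```python
# Hypothetical table of config keys whose non-default values are not ported.
# The keys are taken from the limitations list; the defaults are assumptions.
UNSUPPORTED_DEFAULTS = {
    "add_chg_spin_ebd": False,
    "dim_case_embd": 0,
    "lebedev_quadrature": True,
}


def check_unsupported(config):
    """Raise NotImplementedError naming the first config key that is out of scope."""
    for key, default in UNSUPPORTED_DEFAULTS.items():
        value = config.get(key, default)
        if value != default:
            raise NotImplementedError(
                f"{key}={value!r} is not supported by the dpmodel DPA4 core path"
            )


check_unsupported({"dim_case_embd": 0})  # defaults pass silently

try:
    check_unsupported({"lebedev_quadrature": False})
    message = None
except NotImplementedError as err:
    message = str(err)
```

Naming the key in the message lets a user map the failure straight back to the input file.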

Summary by CodeRabbit

  • New Features

    • Full DPA4/SeZM descriptor and building blocks added to the dpmodel backend (equivariant layers, radial/grid/Wigner support, and serialization/deserialize support).
    • DPA4 (SeZM) GLU-based energy fitting networks and network collection.
    • Packaged Lebedev quadrature loader and real spherical harmonics utilities.
  • Bug Fixes

    • Device/array alignment fixes to prevent backend/device mismatches.
  • Tests

    • Extensive torch-free and cross-backend tests for descriptor, fitting, Lebedev, and spherical-harmonics.

Han Wang added 21 commits June 11, 2026 09:14
@wanghan-iapcm wanghan-iapcm requested a review from OutisLi June 11, 2026 14:07
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Contributor

Review Change Stack

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)
  • source/tests/pt/model/test_dpa4_dpmodel_parity.py
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ddd9e1ba-2108-4e3a-b23d-7567ed33ab2c

📥 Commits

Reviewing files that changed from the base of the PR and between 70702ce and ed0e303.

📒 Files selected for processing (1)
  • source/tests/pt/model/test_dpa4_dpmodel_parity.py

Walkthrough

Adds a dpmodel (array‑API) implementation of the DPA4/SeZM descriptor, equivariant SO(3)/SO(2) building blocks, S2/Lebedev projection and spherical‑harmonics utilities, GLU energy fitting nets, pt‑compatible serialize/deserialize, device-alignment fixes, and extensive unit and cross-backend tests.

Changes

DPA4 descriptor and fitting backend for dpmodel

| Layer | File(s) | Summary |
|---|---|---|
| Math, Lebedev, and spherical harmonics | `deepmd/dpmodel/utils/lebedev.py`, `deepmd/dpmodel/utils/spherical_harmonics.py`, `deepmd/pt/model/descriptor/sezm_nn/lebedev.py` | Packaged Lebedev rule loader, NumPy real spherical harmonics, and PT Lebedev loader refactored to reuse dpmodel utilities. |
| Radial & projection foundation | `deepmd/dpmodel/descriptor/dpa4_nn/radial.py`, `deepmd/dpmodel/descriptor/dpa4_nn/projection.py`, `deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py` | Radial bases and MLP, C³ cutoff envelope, Lebedev S2 projector, and quaternion→Wigner‑D calculator with zonal extraction and pt‑compatible persistence. |
| Edge cache & initial embeddings | `deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py`, `deepmd/dpmodel/descriptor/dpa4_nn/embedding.py` | Padded-edge cache builder (`E = nf*nloc*nnei`), masked-safe geometry, deterministic gamma handling, Wigner outputs, and type/geometric/environment initial embeddings. |
| SO(3) core ops and utils | `deepmd/dpmodel/descriptor/dpa4_nn/so3.py`, `deepmd/dpmodel/descriptor/dpa4_nn/utils.py`, `deepmd/dpmodel/descriptor/dpa4_nn/indexing.py` | Focus/Channel/SO3 linear layers, numeric helpers (init, safe-norm), and SO(3) indexing/projection utilities. |
| Activation, norm, grid | `deepmd/dpmodel/descriptor/dpa4_nn/activation.py`, `deepmd/dpmodel/descriptor/dpa4_nn/norm.py`, `deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py` | Gated/SwiGLU activations, RMS‑based equivariant norms (packed/reduced), S2 grid mixer/grid-net with serialization. |
| SO(2) message passing & attention | `deepmd/dpmodel/descriptor/dpa4_nn/attention.py`, `deepmd/dpmodel/descriptor/dpa4_nn/so2.py` | Destination-wise envelope-gated softmax and padded-edge SO(2) convolution (radial mixers, attention heads, aggregation), with guarded-not-ported flags. |
| Equivariant FFN & interaction block | `deepmd/dpmodel/descriptor/dpa4_nn/ffn.py`, `deepmd/dpmodel/descriptor/dpa4_nn/block.py` | SO(3)-equivariant FFN, optional S2-grid activation integration, and SeZM interaction-block stack with residuals and pt-compatible IO. |
| Descriptor surface & device fixes | `deepmd/dpmodel/descriptor/dpa4.py`, `deepmd/dpmodel/descriptor/__init__.py`, `deepmd/dpmodel/utils/network.py` | DescrptDPA4 registration/constructor/forward (edge cache → interaction blocks → readout), interface getters, serialize/deserialize, update_sel, and device-aligned parameter ops. |
| Fitting networks | `deepmd/dpmodel/fitting/dpa4_ener.py`, `deepmd/dpmodel/fitting/__init__.py` | GLU fitting networks per-type, SeZMNetworkCollection, auto-neuron resolution, registration as SeZMEnergyFittingNet, and serialization. |
| Tests | `source/tests/common/dpmodel/*`, `source/tests/consistent/*` | Torch-free unit tests for descriptor and fitting, Lebedev/spherical-harmonics tests, and DP/PT consistency test suites. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant DescrptDPA4
  participant EdgeCache
  participant SO2Convolution
  participant InteractionBlock
  participant EquivariantFFN
  Client->>DescrptDPA4: call(coord_ext, atype_ext, nlist, ...)
  DescrptDPA4->>EdgeCache: build_edge_cache(...)
  EdgeCache->>SO2Convolution: edge_cache, radial_feat
  SO2Convolution->>InteractionBlock: message-passing outputs
  InteractionBlock->>EquivariantFFN: block readout (l=0)
  EquivariantFFN->>DescrptDPA4: scalar mixing -> descriptor
  DescrptDPA4->>Client: return (descriptor, None, None, None, None)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

Core, Docs, Examples

Suggested reviewers

  • iProzd
  • njzjz

Comment thread source/tests/common/dpmodel/test_lebedev_sh.py Fixed
Comment thread source/tests/consistent/descriptor/test_dpa4.py Fixed
Comment thread source/tests/consistent/descriptor/test_dpa4.py Fixed
Comment thread source/tests/consistent/descriptor/test_dpa4.py Fixed
Comment thread source/tests/consistent/descriptor/test_dpa4.py Fixed
Comment thread source/tests/consistent/fitting/test_dpa4_ener.py Fixed
Comment thread source/tests/consistent/fitting/test_dpa4_ener.py Fixed
@wanghan-iapcm wanghan-iapcm requested a review from njzjz June 11, 2026 14:10

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py`:
- Around line 202-205: The final validity mask updates src_local_safe but leaves
nlist_safe containing pre-filtered neighbor indices, which can leak out-of-range
indices into neighbor_coord_index and cause xp.take to fail; after computing
mask = mask & (src_local >= 0) & (src_local < nloc) and creating src_local_safe,
also zero out entries in nlist_safe where mask is false (e.g., nlist_safe =
xp.where(mask, nlist_safe, xp.zeros_like(nlist_safe))) so nlist_safe is aligned
with the final mask and safe when mapping is None before downstream use in
neighbor_coord_index/xp.take.
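
A simplified numpy sketch of the alignment this comment asks for. It collapses the extended/local index distinction into a single small array and uses hypothetical shapes, so it only illustrates why the index array must follow the final mask before any gather.

```python
import numpy as np

nloc = 3
# Neighbor list for 2 atoms with 3 slots each; -1 marks padding, and 7 stands
# for a neighbor that the final validity check filters out.
nlist = np.array([[1, 2, -1], [0, 7, -1]])
coord = np.arange(nloc * 3, dtype=np.float64).reshape(nloc, 3)

# Final validity mask: only in-range local indices survive.
mask = (nlist >= 0) & (nlist < nloc)

# Align the index array with the final mask so masked slots point at row 0.
nlist_safe = np.where(mask, nlist, np.zeros_like(nlist))

# Gathering is now safe for every slot; masked slots are zeroed afterwards.
neighbor_coord = np.take(coord, nlist_safe.reshape(-1), axis=0).reshape(2, 3, 3)
neighbor_coord = np.where(mask[..., None], neighbor_coord, 0.0)
```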

In `@deepmd/dpmodel/descriptor/dpa4_nn/embedding.py`:
- Around line 155-160: The flattened index array (index) must be range-validated
before calling xp.take to prevent negative atype values from selecting the last
row silently; in the embedding extraction code that uses
self.adam_type_embedding, compute the flattened index from atype (as currently
done) and explicitly check for any index < 0 or index >= allowed_max (use
self.ntypes or the appropriate max id for this embedding including padding if
applicable) and raise a clear error if any are out of range, only then call
xp.take and reshape to (*atype.shape, self.embed_dim).
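
The silent wrap-around and the requested guard can be reproduced with plain numpy. `lookup_type_embedding` is a hypothetical stand-in for the embedding lookup, and the table shape `(ntypes, embed_dim)` is an assumption.

```python
import numpy as np


def lookup_type_embedding(table, atype):
    """Gather rows of table (ntypes, embed_dim) for integer types, with range checks."""
    atype = np.asarray(atype)
    index = atype.reshape(-1)
    ntypes, embed_dim = table.shape
    if np.any(index < 0) or np.any(index >= ntypes):
        raise ValueError(f"atype values must lie in [0, {ntypes}), got {index.tolist()}")
    return np.take(table, index, axis=0).reshape(*atype.shape, embed_dim)


table = np.arange(6, dtype=np.float64).reshape(3, 2)  # 3 types, embed_dim 2

# Without validation, NumPy treats -1 as "last row" instead of failing.
silent = np.take(table, np.array([-1]), axis=0)

ok = lookup_type_embedding(table, [[0, 2], [1, 1]])
```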

In `@deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py`:
- Around line 519-533: The constructor S2GridNet currently sets
grid_method="e3nn" which triggers S2GridProjector to reject that backend; change
the default value of the grid_method parameter in S2GridNet(...) from "e3nn" to
"lebedev" and make the identical default change for the grid_method parameter in
the public constructor/function in projection.py (where S2GridProjector is used)
so the default path uses the implemented Lebedev backend; update any related
docstring/default mentions to match "lebedev".

In `@deepmd/dpmodel/descriptor/dpa4_nn/norm.py`:
- Around line 215-217: Rename the single-letter loop variable l to a descriptive
name (e.g., degree) to satisfy Ruff E741: replace occurrences of "for l in
range(self.lmax + 1):" with "for degree in range(self.lmax + 1):" and update
uses of l inside the loop (e.g., w = scale / (2 * l + 1) and (2 * l + 1)) to use
degree instead; do the same for the other similar loop (the one that also
extends weights_list) so variables like weights_list, self.lmax, and scale
remain unchanged while only the loop variable is renamed.

In `@deepmd/dpmodel/descriptor/dpa4_nn/radial.py`:
- Around line 383-387: The stored self.adam_freqs is a NumPy array but
RadialBasis.call() multiplies it with backend arrays from build_edge_cache(),
causing backend/device mismatches; convert adam_freqs into the active array
namespace before runtime arithmetic by calling the array_api_compat backend's
asarray with the correct device (e.g., xp.asarray(freqs, device=device)) where
adam_freqs is set or right before use in RadialBasis.call(); update
creation/assignment of self.adam_freqs (and/or its usage in RadialBasis.call())
to ensure it is an xp array compatible with r/edge_len so r * freqs and r -
freqs use the same backend.

In `@deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py`:
- Around line 245-249: The helper currently re-regularizes a caller-provided
edge_len (doing xp.sqrt(edge_len*edge_len + eps*eps)), which double-applies the
eps floor; change the logic so edge_len is computed with safe_norm(edge_vec,
eps) only when None and otherwise left unchanged. In other words, in the block
around edge_vec/edge_unit (wignerd.py) remove the else re-regularization and
simply use the passed-in edge_len; keep using safe_norm(edge_vec, eps) when
edge_len is None so callers (e.g. build_edge_cache) that already passed
safe_norm are not perturbed.
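
The effect of applying the eps floor twice is easy to quantify. The sketch below assumes `safe_norm` has the form sqrt(|v|^2 + eps^2), which matches the re-regularization expression quoted above but is not confirmed against the actual helper.

```python
import numpy as np


def safe_norm(vec, eps):
    """Assumed form of a regularized norm: sqrt(|v|^2 + eps^2)."""
    return np.sqrt(np.sum(vec * vec, axis=-1) + eps * eps)


eps = 1e-3
edge_vec = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])

once = safe_norm(edge_vec, eps)  # what the caller already computed
twice = np.sqrt(once * once + eps * eps)  # re-regularizing the passed-in length

# For the zero vector: once == eps, while twice == sqrt(2) * eps.
ratio_at_zero = twice[0] / once[0]
```

For long edges the difference is negligible, but for degenerate or masked edges the length is inflated by a factor of sqrt(2), which is enough to break tight parity tolerances.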

In `@deepmd/dpmodel/descriptor/dpa4.py`:
- Around line 374-376: Rename the single-letter loop/comprehension variable l to
a descriptive name (e.g., degree) wherever it's used with self.l_schedule to
avoid E741; specifically update the comprehensions that assign self.ebed_dims =
[get_so3_dim_of_lmax(l) for l in self.l_schedule] and self.rad_sizes_per_block =
[l + 1 for l in self.l_schedule], and any loops or comprehensions inside
_init_node_l_schedules that use l (also at the other occurrences around the
711-716 and 725 regions) so they use degree (or similar) instead.
- Around line 496-500: The film_scale_strength_log and film_shift_strength_log
are initialized as NumPy arrays but later used with xp.exp inside use_env_seed
(xp may be non-NumPy), causing backend/device mismatch; update the code so these
logs are converted to the active backend/device before calling xp.exp — either
initialize them using the active xp (when available) or, inside use_env_seed
before xp.exp(self.film_scale_strength_log[...]) /
xp.exp(self.film_shift_strength_log[...]), call xp.asarray(...) or xp.array(...)
(preserving dtype) to move the data onto xp/device; reference symbols:
film_scale_strength_log, film_shift_strength_log, use_env_seed, xp.exp.

In `@deepmd/dpmodel/fitting/dpa4_ener.py`:
- Around line 240-253: The _convert_key function currently uses asserts and
allows negative or out-of-range keys (leading to unintended negative indexing);
replace asserts with explicit validations: if key is int, check isinstance(key,
int) and 0 <= key < self.ntypes**self.ndim and raise ValueError on failure; if
key is str, parse to tuple but then validate the tuple length equals self.ndim
and each element is an int in range 0 <= tt < self.ntypes; if key is tuple,
validate same length and per-element bounds and types; raise TypeError for wrong
types and ValueError for out-of-range values so negative indices cannot slip
through and assertions are not relied upon.
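
A sketch of the explicit validation described here, limited to int and tuple keys because the string key format is not given in this text. `TypeKeyIndexer` is a hypothetical class, and the flattening order is an assumption.

```python
class TypeKeyIndexer:
    """Hypothetical stand-in for the key handling of a per-type network collection."""

    def __init__(self, ntypes, ndim):
        self.ntypes = ntypes
        self.ndim = ndim

    def convert_key(self, key):
        """Map an int or a tuple of type indices to a flat index, with validation."""
        if isinstance(key, bool) or not isinstance(key, (int, tuple)):
            raise TypeError(f"key must be int or tuple, got {type(key).__name__}")
        if isinstance(key, int):
            if not 0 <= key < self.ntypes**self.ndim:
                raise ValueError(f"flat key {key} out of range")
            return key
        if len(key) != self.ndim:
            raise ValueError(f"key {key} must have length {self.ndim}")
        for tt in key:
            if isinstance(tt, bool) or not isinstance(tt, int):
                raise TypeError(f"key element {tt!r} is not an int")
            if not 0 <= tt < self.ntypes:
                raise ValueError(f"type index {tt} out of range [0, {self.ntypes})")
        # Assumed flattening order: first element varies fastest.
        return sum(tt * self.ntypes**ii for ii, tt in enumerate(key))


indexer = TypeKeyIndexer(ntypes=3, ndim=2)
flat = indexer.convert_key((1, 2))
```

Unlike `assert`, these checks survive `python -O`, and a key such as `-1` can no longer index the last network by accident.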

In `@deepmd/dpmodel/utils/lebedev.py`:
- Line 68: The code currently coerces precision with int(precision) when
building rule_key which silently truncates invalid values (e.g., 11.9); change
this to validate and reject non-integer precision instead: in the function where
rule_key is created (the variable rule_key that uses f"{int(precision):03d}"),
assert that precision is an instance of int (or raise a ValueError/TypeError
with a clear message) before formatting, and then build rule_key with
f"{precision:03d}" so only true integer precisions are accepted.
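
One way to reject non-integer precisions, sketched with a hypothetical `make_rule_key` helper that returns only the zero-padded numeric part of the key. Using `numbers.Integral` instead of a bare `int` check is an assumption made here so that NumPy integer scalars, which register with that ABC, are still accepted.

```python
import numbers


def make_rule_key(precision):
    """Format a Lebedev precision as a zero-padded key, rejecting non-integers."""
    if isinstance(precision, bool) or not isinstance(precision, numbers.Integral):
        raise TypeError(f"precision must be an integer, got {precision!r}")
    return f"{int(precision):03d}"


key = make_rule_key(11)
truncated = f"{int(11.9):03d}"  # what silent coercion would have produced
```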

In `@deepmd/dpmodel/utils/spherical_harmonics.py`:
- Around line 66-67: Guard against scalar inputs before indexing by checking the
rank of vecs first: convert vecs to an array if not already (np.asarray(vecs))
and use vecs.ndim (or np.ndim(vecs)) to detect scalars; if vecs.ndim == 0 (or
otherwise <1) or if vecs.shape[-1] != 3 then raise the existing ValueError.
Update the validation around the existing vecs check so scalars consistently
trigger the same ValueError instead of causing an IndexError.
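
The rank-first check can be sketched as follows; `validate_vectors` is a hypothetical helper and not the function in spherical_harmonics.py.

```python
import numpy as np


def validate_vectors(vecs):
    """Return vecs as an array of shape (..., 3) or raise ValueError."""
    vecs = np.asarray(vecs, dtype=np.float64)
    # Check the rank first so a scalar never reaches shape[-1].
    if vecs.ndim < 1 or vecs.shape[-1] != 3:
        raise ValueError(f"expected shape (..., 3), got {vecs.shape}")
    return vecs


good = validate_vectors([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```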

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6517a5c6-fe0b-4fdb-9145-25ca001d8bb4

📥 Commits

Reviewing files that changed from the base of the PR and between 890e38a and 80b4ba1.

📒 Files selected for processing (30)
  • deepmd/dpmodel/descriptor/__init__.py
  • deepmd/dpmodel/descriptor/dpa4.py
  • deepmd/dpmodel/descriptor/dpa4_nn/__init__.py
  • deepmd/dpmodel/descriptor/dpa4_nn/activation.py
  • deepmd/dpmodel/descriptor/dpa4_nn/attention.py
  • deepmd/dpmodel/descriptor/dpa4_nn/block.py
  • deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
  • deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
  • deepmd/dpmodel/descriptor/dpa4_nn/ffn.py
  • deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
  • deepmd/dpmodel/descriptor/dpa4_nn/indexing.py
  • deepmd/dpmodel/descriptor/dpa4_nn/norm.py
  • deepmd/dpmodel/descriptor/dpa4_nn/projection.py
  • deepmd/dpmodel/descriptor/dpa4_nn/radial.py
  • deepmd/dpmodel/descriptor/dpa4_nn/so2.py
  • deepmd/dpmodel/descriptor/dpa4_nn/so3.py
  • deepmd/dpmodel/descriptor/dpa4_nn/utils.py
  • deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py
  • deepmd/dpmodel/fitting/__init__.py
  • deepmd/dpmodel/fitting/dpa4_ener.py
  • deepmd/dpmodel/utils/lebedev.py
  • deepmd/dpmodel/utils/lebedev_rules.npz
  • deepmd/dpmodel/utils/spherical_harmonics.py
  • deepmd/pt/model/descriptor/sezm_nn/lebedev.py
  • source/tests/common/dpmodel/test_descrpt_dpa4.py
  • source/tests/common/dpmodel/test_dpa4_ener.py
  • source/tests/common/dpmodel/test_lebedev_sh.py
  • source/tests/consistent/descriptor/test_dpa4.py
  • source/tests/consistent/fitting/test_dpa4_ener.py
  • source/tests/pt/model/test_dpa4_dpmodel_parity.py

Comment thread deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
Comment thread deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
Comment thread deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py Outdated
Comment thread deepmd/dpmodel/descriptor/dpa4_nn/norm.py
Comment thread deepmd/dpmodel/descriptor/dpa4_nn/norm.py
Comment thread deepmd/dpmodel/descriptor/dpa4.py
Comment thread deepmd/dpmodel/descriptor/dpa4.py
Comment thread deepmd/dpmodel/fitting/dpa4_ener.py
Comment thread deepmd/dpmodel/utils/lebedev.py
Comment thread deepmd/dpmodel/utils/spherical_harmonics.py Outdated
@codecov

codecov Bot commented Jun 11, 2026


Codecov Report

❌ Patch coverage is 96.68493% with 121 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.19%. Comparing base (890e38a) to head (ed0e303).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| deepmd/dpmodel/descriptor/dpa4.py | 90.12% | 47 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/so2.py | 97.60% | 16 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/projection.py | 90.78% | 13 Missing ⚠️ |
| deepmd/dpmodel/fitting/dpa4_ener.py | 94.04% | 10 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/block.py | 96.42% | 7 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/norm.py | 97.10% | 7 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py | 98.65% | 6 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py | 97.60% | 5 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/so3.py | 97.77% | 4 Missing ⚠️ |
| deepmd/dpmodel/descriptor/dpa4_nn/radial.py | 98.08% | 3 Missing ⚠️ |

... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5515      +/-   ##
==========================================
+ Coverage   81.52%   82.19%   +0.67%     
==========================================
  Files         872      891      +19     
  Lines       97964   101600    +3636     
  Branches     4241     4240       -1     
==========================================
+ Hits        79865    83511    +3646     
+ Misses      16795    16787       -8     
+ Partials     1304     1302       -2     


Comment thread deepmd/dpmodel/fitting/dpa4_ener.py Fixed

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/dpmodel/descriptor/dpa4_nn/so2.py`:
- Around line 394-397: Several raw NumPy buffers are not being aligned to the
current device; wrap each offending buffer with xp.asarray(..., device=device)
where they are materialized: ensure so2_linear.bias0 (the bias0 reshape block in
the SO(2) bias correction), AtomExcludeMask.type_mask, and NativeLayer.idt are
converted via xp.asarray(original_array, device=device) before any
reshape/assignment/usage so the backend (xp) handles device placement
consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9d0405b9-fbf2-4d0a-b117-606668787ddc

📥 Commits

Reviewing files that changed from the base of the PR and between 80b4ba1 and 93e4d0b.

📒 Files selected for processing (20)
  • deepmd/dpmodel/descriptor/dpa4.py
  • deepmd/dpmodel/descriptor/dpa4_nn/activation.py
  • deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
  • deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
  • deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
  • deepmd/dpmodel/descriptor/dpa4_nn/norm.py
  • deepmd/dpmodel/descriptor/dpa4_nn/projection.py
  • deepmd/dpmodel/descriptor/dpa4_nn/radial.py
  • deepmd/dpmodel/descriptor/dpa4_nn/so2.py
  • deepmd/dpmodel/descriptor/dpa4_nn/so3.py
  • deepmd/dpmodel/fitting/dpa4_ener.py
  • deepmd/dpmodel/utils/exclude_mask.py
  • deepmd/dpmodel/utils/lebedev.py
  • deepmd/dpmodel/utils/network.py
  • deepmd/dpmodel/utils/spherical_harmonics.py
  • source/tests/common/dpmodel/test_dpa4_ener.py
  • source/tests/common/dpmodel/test_lebedev_sh.py
  • source/tests/consistent/descriptor/test_dpa4.py
  • source/tests/consistent/fitting/test_dpa4_ener.py
  • source/tests/pt/model/test_dpa4_dpmodel_parity.py
💤 Files with no reviewable changes (2)
  • source/tests/consistent/descriptor/test_dpa4.py
  • source/tests/consistent/fitting/test_dpa4_ener.py
🚧 Files skipped from review as they are similar to previous changes (11)
  • deepmd/dpmodel/utils/spherical_harmonics.py
  • deepmd/dpmodel/utils/lebedev.py
  • source/tests/common/dpmodel/test_dpa4_ener.py
  • source/tests/common/dpmodel/test_lebedev_sh.py
  • deepmd/dpmodel/descriptor/dpa4_nn/projection.py
  • deepmd/dpmodel/descriptor/dpa4_nn/radial.py
  • deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
  • deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
  • deepmd/dpmodel/descriptor/dpa4_nn/so3.py
  • deepmd/dpmodel/descriptor/dpa4_nn/norm.py
  • deepmd/dpmodel/fitting/dpa4_ener.py

Comment thread deepmd/dpmodel/descriptor/dpa4_nn/so2.py
@wanghan-iapcm wanghan-iapcm enabled auto-merge June 11, 2026 16:45
@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Jun 11, 2026
@wanghan-iapcm wanghan-iapcm removed this pull request from the merge queue due to a manual request Jun 11, 2026
@wanghan-iapcm

Collaborator Author

pre-commit.ci run

@wanghan-iapcm wanghan-iapcm enabled auto-merge June 11, 2026 23:07
@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Jun 12, 2026
Merged via the queue into deepmodeling:master with commit 0de53e9 Jun 12, 2026
70 checks passed
@wanghan-iapcm wanghan-iapcm deleted the feat-dpmodel-dpa4 branch June 12, 2026 08:42
zhaiwenxi pushed a commit to zhaiwenxi/deepmd-kit that referenced this pull request Jun 15, 2026
…odeling#5522)

PR-2 of the DPA4/SeZM porting series (PR-1: deepmodeling#5515, dpmodel core). Makes
DPA4 trainable and torch.export-compatible on the pt_expt backend.

## What's included

- **Descriptor wrapper** (`deepmd/pt_expt/descriptor/dpa4.py`): thin
`@torch_module` `DescrptDPA4` registered as `SeZM`/`sezm`/`DPA4`/`dpa4`,
plus explicit converter registrations for the two dpmodel classes that
cannot be auto-wrapped via serialize round-trip: `WignerDCalculator`
(deserialize raises by design — rebuilt from `lmax`/`eps`/`precision`)
and `SwiGLU` (parameter-free, no serialize).
- **Trainable-weight promotion**: dpmodel stores DPA4 weights as bare
numpy arrays (mirroring pt SeZM parameter names), which the generic
ndarray rule registers as non-trainable buffers. A localized
`_TRAINABLE_ATTRS` table re-registers exactly the attributes that are
`nn.Parameter` in the pt SeZM implementation, after
`__init__`/`deserialize`. Two supporting infra fixes:
`_try_convert_list` now handles Optional-module lists
(`SO2Convolution.non_linearities` ends with `None`), and
`xp_asarray_nodetach` (`deepmd/dpmodel/array_api.py`) replaces
`xp.asarray` on weight attributes in dpa4 dpmodel code so torch autograd
is not silently detached.
- **Fitting wrapper** (`deepmd/pt_expt/fitting/dpa4_ener.py`):
`SeZMEnergyFittingNet` (`dpa4_ener`/`sezm_ener`), `GLUFittingNet`, and a
`SeZMNetworkCollection` wrapper mirroring the `NetworkCollection`
ModuleDict pattern so GLU weights are optimizer-visible.
- **Model assembly** (`deepmd/pt_expt/model/get_model.py`):
`get_sezm_model` dispatched for model type `dpa4`/`sezm`/`SeZM`/`DPA4`,
mirroring pt's config handling (descriptor/fitting type defaults,
`pair_exclude_types` reconciliation). Fail-fast `NotImplementedError`
guards for pt-only extensions: spin, `bridging_method != none`, `lora`,
`use_compile`, `preset_out_bias`. `enable_tf32` is accepted and ignored
(pt_expt always runs "highest" matmul precision; documented).

## Tests

- pt_expt trio (`source/tests/pt_expt/descriptor/test_dpa4.py`):
consistency (serde round-trip + dpmodel reference),
`torch.export.export`, `make_fx` through forward + `autograd.grad`;
fitting analogs in `source/tests/pt_expt/fitting/test_dpa4_ener.py`.
- **Parameter-gradient parity vs pt**
(`source/tests/pt/model/test_dpa4_ptexpt_grad_parity.py`): weight-copied
pt SeZM vs pt_expt, quadratic loss, every parameter's gradient compared
1:1 via serialize-tree alignment (83/83 parameters, no dead parameters,
max deviation ~3e-11 rel from fp64 summation order).
- Model assembly tests
(`source/tests/pt_expt/model/test_get_model_dpa4.py`): aliases,
defaults, exclude-types branches, all NIE guards.
- Training end-to-end
(`source/tests/pt_expt/test_training.py::test_training_loop_dpa4`) on
the water fixture through the full trainer.
- Cross-backend consistency rows enabled for pt_expt
(`source/tests/consistent/`): 20 new comparisons pass at existing
harness tolerances.
- PR-1 parity suites stay green (381 passed) including for the small
dpmodel fixes this PR needed (serialize via `to_numpy_array`,
explicit-dtype zeros, ParameterList-safe `_load_variables`,
`charge_spin` kwarg for atomic-model interface compatibility).

## Known limitations

- No freeze/`.pt2` export, no Python `DeepEval`, no C++/LAMMPS inference
— that is PR-3 of the series.
- No multitask: `share_params` raises `NotImplementedError`.
- No model-dict deserialization of pt `SeZMModel` checkpoints (PR-3).
- pt-only SeZM extensions fail fast rather than being ported: spin,
bridging/ZBL, LoRA, `use_compile`, `preset_out_bias`; `enable_tf32`
ignored (numerically conservative).
- `_try_convert_list`'s ParameterList branch does not respect
`trainable=False` for list-valued weights (pre-existing backend-wide
behavior).
- CUDA validated at tolerance level only in CI; performance vs the pt
Triton implementation is unbenchmarked.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added PyTorch-experimental DPA4/SeZM descriptor and energy-fitting
components, with public package exports.
* Added EnergyModel factory/dispatch to build DPA4/SeZM models from
configs.

* **Improvements**
* Enhanced DPA4/SeZM runtime handling to keep gradient connectivity and
ensure device-consistent tensor placement.
* Updated DPA4 descriptor interface to accept optional `charge_spin` for
compatibility.

* **Tests**
* Expanded experimental coverage with serialization/parity checks, FX
tracing/export validation, trainable-parameter behavior tests, and
descriptor/fitting gradient parity verification.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
njzjz pushed a commit to njzjz/deepmd-kit that referenced this pull request Jun 17, 2026
…eepmodeling#5540)

PR-3 (final) of the DPA4/SeZM porting series — pt_expt **inference**:
freeze to `.pt2`, Python `DeepEval`, pt→pt_expt checkpoint interop, C++
single-rank, and LAMMPS single-rank. Follows PR-1 (deepmodeling#5515, dpmodel core)
and PR-2 (deepmodeling#5522, pt_expt training/export).

## What's included

- **Model freeze to `.pt2`** (`deepmd/pt_expt/model/ener_model.py`,
`deepmd/dpmodel/.../ener_model.py`,
`deepmd/dpmodel/descriptor/dpa4.py`): register `EnergyModel` under
`sezm_ener`/`dpa4_ener` model-type aliases so `BaseModel.deserialize`
resolves a standard DPA4 energy model (whose fitting type is
`sezm_ener`). Fixed a `torch.export` specialization where `int()` on
symbolic shapes baked `nf*nloc` (embedding/so2/attention).
- **NoPBC export fix** (`deepmd/dpmodel/descriptor/dpa4.py`): the
`atype_ext[:, :nloc]` slice emitted a spurious `Ne(nall, nloc)` shape
guard that crashed the compiled artifact when `nall==nloc` (no ghosts);
replaced with `xp_take_first_n` (index_select). NoPBC now matches PBC.
- **pt→pt_expt checkpoint interop** (`deepmd/pt_expt/model/model.py`):
`BaseModel.deserialize` unwraps pt's bespoke `SeZMModel` serialization
(`type:"SeZM"`, nested `sezm_atomic` atomic model with the pt-only dens
head), validates versions, rejects unsupported features
(bridging/lora/dens/active_mode) with `NotImplementedError`, and
delegates to the standard path.
- **Warn on silently-ignored flags** (`use_amp` descriptor,
`enable_tf32` model): warn-once instead of silent drop.
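
The warn-once behaviour for ignored flags can be pictured with a small sketch. This is not the code from the PR: the helper name `warn_once_ignored`, the module-level `_warned` set, and the message text are all assumptions made for illustration, using only the standard `warnings` module.

```python
import warnings

# Hypothetical registry of flags that have already produced a warning.
_warned: set = set()


def warn_once_ignored(flag: str, value: object) -> bool:
    """Warn the first time an ignored flag is set; return True if a warning fired."""
    if not value or flag in _warned:
        # Unset flags never warn; set flags warn only on first sight.
        return False
    _warned.add(flag)
    warnings.warn(
        f"`{flag}` is set but has no effect in this backend; it is ignored.",
        UserWarning,
        stacklevel=2,
    )
    return True


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fired = [
        warn_once_ignored("use_amp", True),       # first use: warns
        warn_once_ignored("use_amp", True),       # repeat: silent
        warn_once_ignored("enable_tf32", True),   # different flag: warns
        warn_once_ignored("enable_tf32", False),  # flag not set: silent
    ]
n_warnings = len(caught)
```

Tracking the flag names in an explicit set keeps the "once" guarantee independent of the user's warning filters, which is one plausible reason to prefer it over relying on Python's default de-duplication.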

## Tests

- **Model freeze** `source/tests/pt_expt/model/test_dpa4_export.py`:
dual-artifact `.pt2`, metadata, AOTI load, artifact-vs-eager parity
(1e-10). *(CI-skipped — AOTI is slow; run locally.)*
- **DeepEval parity vs pt**
`source/tests/pt_expt/infer/test_dpa4_deep_eval.py`: pt `.pt` vs pt_expt
`.pt2` energy/force/global-virial/atom-energy at fp64 1e-10, **PBC and
NoPBC**; doubles as the checkpoint-interop proof. Per-atom virial
compared by sum (pt's edge-scatter from deepmodeling#5518 redistributes it; global
virial matches). *(CI-skipped — AOTI.)*
- **Interop unit tests**
`source/tests/pt_expt/model/test_dpa4_interop.py` (CI-runnable, no
AOTI): happy-path pt-checkpoint→pt_expt round-trip + every guard branch
+ version validation + `@variables` filtering.
- **Alias deserialize guard** + **use_amp/enable_tf32 warn-once** tests
(CI-runnable).
- **Fixture generator** `source/tests/infer/gen_dpa4.py` (+ wired into
`source/install/test_cc_local.sh`).
- **C++ single-rank** `source/api_cc/tests/test_deeppot_dpa4_ptexpt.cc`:
20 tests (double+float), dpa3-matched tolerances. Validated locally.
- **LAMMPS single-rank** `source/lmp/tests/test_lammps_dpa4_pt2.py`:
parity + `atom_modify map yes` + the deepmodeling#5450 no-atom-map fail-fast.
**Validated on a GPU box (7 passed).**
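
Why the per-atom virial is compared by sum rather than element-wise can be shown with synthetic numbers. The sketch below uses random data, not model output; the array names and the `(natoms, 9)` layout are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
natoms = 6

# Hypothetical per-atom virials from one backend, shape (natoms, 9).
atom_virial_a = rng.normal(size=(natoms, 9))

# A second backend that attributes the same total differently:
# move one contribution from atom 0 to atom 1.
shift = rng.normal(size=9)
atom_virial_b = atom_virial_a.copy()
atom_virial_b[0] -= shift
atom_virial_b[1] += shift

# Element-wise comparison fails, but the global sum agrees.
elementwise_match = np.allclose(atom_virial_a, atom_virial_b, rtol=0.0, atol=1e-10)
sum_match = np.allclose(
    atom_virial_a.sum(axis=0), atom_virial_b.sum(axis=0), rtol=0.0, atol=1e-10
)
```

Here `elementwise_match` is false while `sum_match` is true, which is the situation the DeepEval parity test accommodates by checking only the summed per-atom virial.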

PR-1 parity suites stay green; the small dpmodel edits are
parity-revalidated.
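
One of the small dpmodel edits is the slice-to-gather replacement from the NoPBC export fix. A minimal NumPy sketch of the value-level parity between the two forms follows. The function `take_first_n` is a hypothetical stand-in, not the real `xp_take_first_n`, whose implementation is not shown here; the sketch only checks numerical equivalence and says nothing about `torch.export` guard behaviour.

```python
import numpy as np


def take_first_n(x: np.ndarray, n: int, axis: int = 1) -> np.ndarray:
    """Hypothetical stand-in: select the first n entries along `axis` by index gather."""
    return np.take(x, np.arange(n), axis=axis)


nf, nloc, nghost = 2, 3, 2
atype_ext = np.arange(nf * (nloc + nghost)).reshape(nf, nloc + nghost)

# With ghost atoms (nall > nloc): gather and slice give the same values.
with_ghosts_ok = np.array_equal(take_first_n(atype_ext, nloc), atype_ext[:, :nloc])

# Without ghost atoms (nall == nloc): the gather is an identity selection.
atype_no_ghost = atype_ext[:, :nloc]
no_ghosts_ok = np.array_equal(take_first_n(atype_no_ghost, nloc), atype_no_ghost)
```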

## Known limitations

- **Single-rank only.** Multi-rank/MPI LAMMPS for DPA4 is deferred (no
live multi-rank cell; the with-comm artifact compiles but its runtime is
not exercised). DPA4 is a message-passing descriptor, so multi-rank
follows the existing deepmodeling#5450/deepmodeling#5430 machinery in a later PR.
- **No `.pth` (torch.jit) DPA4** — the pt backend has no `sezm_ener`
*model* registration, so `.pth` freeze of a standard DPA4 energy model
isn't available; not needed for the pt_expt inference path.
- **Per-atom virial** is not compared element-wise pt-vs-pt_expt (only
its global sum) — deepmodeling#5518 changed pt's edge-scatter distribution; both are
correct, the distribution differs.
- **AOTI tests are CI-skipped** (multi-minute compile) — the
freeze/DeepEval paths are validated locally, not in CI; the
interop/alias/warn tests give CI coverage of the non-AOTI logic.
- **fp64 only**; fp32 freeze untested. CUDA validated at LAMMPS level on
a GPU box; the AOTI parity numbers are from CPU fp64.
- **`use_amp`/`enable_tf32`** remain functionally ignored (now warned) —
by design for this series.
- pt SeZM features out of scope (guarded by `NotImplementedError`): spin,
ZBL bridging, LoRA, dens/direct-force/denoising heads, SO3 grid
projection, GridMLP, SO(2) attention extensions.
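
The guard pattern behind these `NotImplementedError` rejections can be sketched in plain Python. Everything below is an assumption for illustration: the payload layout, the key names in `UNSUPPORTED_KEYS`, the `@version` handling, and the function `check_pt_sezm_payload` are hypothetical and do not reproduce the actual `BaseModel.deserialize` logic.

```python
# Hypothetical feature keys that the importing backend does not support.
UNSUPPORTED_KEYS = ("bridging", "lora", "dens", "active_mode")


def check_pt_sezm_payload(data: dict, supported_version: int = 1) -> dict:
    """Validate a hypothetical pt SeZM payload and return its nested atomic model."""
    if data.get("type") != "SeZM":
        raise ValueError(f"expected type 'SeZM', got {data.get('type')!r}")
    version = data.get("@version", 1)
    if version > supported_version:
        raise ValueError(f"unsupported version {version} > {supported_version}")
    for key in UNSUPPORTED_KEYS:
        if data.get(key):
            # Fail loudly instead of silently dropping the feature.
            raise NotImplementedError(f"feature {key!r} is not supported here")
    return data["sezm_atomic"]


# Happy path: a plain payload is unwrapped to its nested atomic model.
ok = check_pt_sezm_payload(
    {"type": "SeZM", "@version": 1, "sezm_atomic": {"descriptor": {}}}
)

# Guard path: an enabled unsupported feature is rejected.
try:
    check_pt_sezm_payload({"type": "SeZM", "lora": {"rank": 4}, "sezm_atomic": {}})
    rejected = False
except NotImplementedError:
    rejected = True
```

Rejecting up front means a checkpoint that relies on an unported feature cannot load into a model that would quietly compute something different.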


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

# Release Notes

* **New Features**
  * Enabled DPA4 model inference via the pt_expt backend using
    dual-artifact compilation.
  * Registered the EnergyModel under additional aliases: `sezm_ener` and
    `dpa4_ener`.

* **Improvements**
  * Improved dynamic/symbolic shape handling across DPA4 components for
    export/tracing stability.
  * Enhanced pt SeZM/DPA4 checkpoint deserialization and normalization for
    interoperability.
  * Added one-time warnings when `use_amp` or `enable_tf32` settings are
    ineffective.

* **Tests**
  * Added C++ and Python coverage for pt2 inference, LAMMPS integration,
    model export/freeze, parity, interop, and warning behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@coderabbitai coderabbitai Bot mentioned this pull request Jun 28, 2026