feat(dpmodel): backend-agnostic DPA4 descriptor and fitting (core path) by wanghan-iapcm · Pull Request #5515 · deepmodeling/deepmd-kit

wanghan-iapcm · 2026-06-11T14:04:54Z

What this PR does

Ports the core DPA4/SeZM descriptor and its GLU energy fitting from the PyTorch-only implementation (deepmd/pt, #5448/#5503) to the backend-agnostic dpmodel layer (numpy / array-API). This is PR 1 of a 3-PR series (next: pt_expt wrappers + training; then .pt2 freeze + Python/C++ inference).

deepmd/dpmodel/utils/lebedev.py + relocated lebedev_rules.npz (single source of truth; pt re-wraps it), deepmd/dpmodel/utils/spherical_harmonics.py — general-lmax real spherical harmonics in pure numpy, matching e3nn's normalize=True, normalization="norm" convention exactly (validated vs e3nn to ~1e-15), so dpmodel builds the S2-grid constants without e3nn/torch.
deepmd/dpmodel/descriptor/dpa4_nn/ — array-API ports of the sezm_nn math modules (indexing, utils, radial, norm, wignerd, so3, activation, projection, grid_net, so2, attention, embedding, edge_cache, ffn, block), structurally mirroring the pt code 1:1.
deepmd/dpmodel/descriptor/dpa4.py — DescrptDPA4, registered dpa4/DPA4/sezm/SeZM, serialization-compatible with pt DescrptSeZM (cross-deserialization tested in both directions).
deepmd/dpmodel/fitting/dpa4_ener.py — SeZMEnergyFittingNet (GLU fitting), registered dpa4_ener/sezm_ener.

Design: padded edges instead of sparse `nonzero` edges

pt builds a sparse edge list via torch.nonzero and aggregates with index_add_/scatter. The dpmodel port uses a shape-static padded edge layout (E = nf*nloc*nnei, node-contiguous, validity mask): per-edge math is identical, node aggregations and the segment softmax become masked reductions over the nnei axis. This removes data-dependent shapes (torch.export-friendly for the upcoming pt_expt PR) and avoids non-standard scatter ops in array-API code. Masked slots carry a safe dummy direction before quaternion/Wigner evaluation so future autograd backward passes cannot be NaN-poisoned by dead slots.
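The padded layout described above can be sketched in plain NumPy. This is a minimal illustration, not the PR's code: the names `masked_softmax`, `padded_aggregate`, and `sparse_aggregate` are hypothetical, and the sparse reference uses NumPy's `ufunc.at` as a stand-in for pt's `index_add_`/scatter. It shows that a masked reduction over the `nnei` axis reproduces the segment softmax and aggregation of the sparse edge list, even with NaN garbage in dead slots.

```python
import numpy as np


def masked_softmax(logits, mask):
    """Softmax over the last (nnei) axis, ignoring slots where mask is False."""
    # Overwrite dead slots with a finite floor so garbage (even NaN) cannot leak in.
    floor = np.finfo(logits.dtype).min
    safe = np.where(mask, logits, floor)
    shifted = safe - safe.max(axis=-1, keepdims=True)
    weights = np.where(mask, np.exp(shifted), 0.0)
    denom = weights.sum(axis=-1, keepdims=True)
    # A node with no valid neighbor gets all-zero weights instead of 0/0.
    return weights / np.where(denom > 0.0, denom, 1.0)


def padded_aggregate(logits, messages, mask):
    """Attention-weighted neighbor sum in the padded (nf, nloc, nnei) layout."""
    weights = masked_softmax(logits, mask)
    clean = np.where(mask[..., None], messages, 0.0)
    return (weights[..., None] * clean).sum(axis=2)


def sparse_aggregate(logits, messages, mask):
    """Reference path: explicit edge list from nonzero plus scatter accumulation."""
    nf, nloc, _ = mask.shape
    frame, node, slot = np.nonzero(mask)
    seg = frame * nloc + node  # destination node of every valid edge
    edge_logit = logits[frame, node, slot]
    seg_max = np.full(nf * nloc, -np.inf)
    np.maximum.at(seg_max, seg, edge_logit)
    w = np.exp(edge_logit - seg_max[seg])
    denom = np.zeros(nf * nloc)
    np.add.at(denom, seg, w)
    out = np.zeros((nf * nloc, messages.shape[-1]))
    np.add.at(out, seg, (w / denom[seg])[:, None] * messages[frame, node, slot])
    return out.reshape(nf, nloc, -1)


rng = np.random.default_rng(0)
nf, nloc, nnei, dim = 2, 4, 5, 3
mask = rng.random((nf, nloc, nnei)) < 0.6
mask[0, 0, :] = False  # one node with no neighbors at all
logits = rng.normal(size=(nf, nloc, nnei))
messages = rng.normal(size=(nf, nloc, nnei, dim))
# Adversarial garbage in dead slots must not change the result.
logits[~mask] = np.nan
messages[~mask] = np.nan

padded = padded_aggregate(logits, messages, mask)
sparse = sparse_aggregate(logits, messages, mask)
print("max abs difference:", np.max(np.abs(padded - sparse)))
```

Note the use of `np.where` rather than multiplication by the mask: `NaN * 0` is still `NaN`, so selecting is what keeps dead slots inert.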

Validation

Per-module weight-copy parity tests vs the pt classes (fp64, rtol 1e-12 / atol 1e-14) in source/tests/pt/model/test_dpa4_dpmodel_parity.py (377 tests), including dual-cache tests proving pt-sparse vs dp-padded equivalence with adversarial garbage in masked slots.
Descriptor-level end-to-end parity vs pt: max abs error ~6e-15; descriptor+fitting chained parity at rtol 1e-10.
Torch-free tests (source/tests/common/dpmodel/): serialize round-trips, permutation equivariance, rotation invariance, masked-edge inertness, guard coverage.
Cross-backend harness tests: source/tests/consistent/descriptor/test_dpa4.py, source/tests/consistent/fitting/test_dpa4_ener.py (dpa3-precedent tolerances; f64 atol stricter than precedent).
A model-dict contract test pins the pt dpa4 model serialize() top-level shape for the follow-up PRs.

Known limitations

Core path only. Out-of-scope features raise NotImplementedError naming the config key at construction/deserialize: spin/charge embedding (add_chg_spin_ebd), DeNS dual fitting, ZBL/inner-clamp bridging, multitask FiLM/case embedding (dim_case_embd>0), LoRA, SO3-grid paths (ffn_so3_grid, node_wise_*, message_node_*), grid_mlp, attention-residual modes, layer_scale, attention projections.
Lebedev quadrature only: lebedev_quadrature=False (e3nn product grid) is rejected; the default config uses Lebedev.
Wigner-D evaluates the closed form that pt's lstsq-fitted monomial kernels approximate; parity ≤1e-14 for the core lmax≤4 range, diverging O(1e-13) only at lmax≥9 (pt's own kernels drift from the closed form there).
Padded edges spend compute on masked slots (like dpa1/2/3); pt's sparse path remains faster for very sparse neighbor lists. random_gamma gauge augmentation is accepted and serialized but applied only when explicitly driven (numpy RNG; not stream-identical to torch's draw — gauge invariance is property-tested).
No model-level dpmodel assembly yet (energy model wiring, pt_expt wrappers, training, freeze, inference are PR-2/PR-3).
dpmodel numpy execution is a correctness reference, not a performance target; fp16/bf16 compute-dtype promotion points are not bit-mirrored (fp64/fp32 are).
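For readers unfamiliar with the Lebedev quadrature named in the limitations above, the smallest rule makes the idea concrete. This is a hedged sketch that does not read `lebedev_rules.npz`: it hard-codes the 6-point octahedral rule, and the normalization (weights summing to 1, so the rule computes a sphere average) is an assumption that may differ from the packaged file.

```python
import numpy as np

# 6-point octahedral Lebedev rule: the axis points, all with equal weight.
points = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]],
    dtype=np.float64,
)
# Assumption: weights normalized to sum to 1 (sphere average, not surface integral).
weights = np.full(6, 1.0 / 6.0)


def sphere_average(func):
    """Approximate the average of func(x, y, z) over the unit sphere."""
    x, y, z = points.T
    return float(np.sum(weights * func(x, y, z)))


# Polynomials up to degree 3 are integrated exactly by this rule.
avg_x2 = sphere_average(lambda x, y, z: x**2)       # exact average is 1/3
avg_xyz = sphere_average(lambda x, y, z: x * y * z)  # exact average is 0
# Degree 4 exceeds the rule's precision: the exact average of x**4 is 1/5.
avg_x4 = sphere_average(lambda x, y, z: x**4)

print(avg_x2, avg_xyz, avg_x4)
```

Higher-precision rules add more points so that higher-degree polynomials (and hence higher-`lmax` spherical harmonics products) are integrated exactly.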

Summary by CodeRabbit

New Features
- Full DPA4/SeZM descriptor and building blocks added to the dpmodel backend (equivariant layers, radial/grid/Wigner support, and serialization/deserialize support).
- DPA4 (SeZM) GLU-based energy fitting networks and network collection.
- Packaged Lebedev quadrature loader and real spherical harmonics utilities.
Bug Fixes
- Device/array alignment fixes to prevent backend/device mismatches.
Tests
- Extensive torch-free and cross-backend tests for descriptor, fitting, Lebedev, and spherical-harmonics.


coderabbitai · 2026-06-11T14:08:15Z

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)

source/tests/pt/model/test_dpa4_dpmodel_parity.py

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ddd9e1ba-2108-4e3a-b23d-7567ed33ab2c

📥 Commits

Reviewing files that changed from the base of the PR and between 70702ce and ed0e303.

📒 Files selected for processing (1)

source/tests/pt/model/test_dpa4_dpmodel_parity.py


📝 Walkthrough

Walkthrough

Adds a dpmodel (array‑API) implementation of the DPA4/SeZM descriptor, equivariant SO(3)/SO(2) building blocks, S2/Lebedev projection and spherical‑harmonics utilities, GLU energy fitting nets, pt‑compatible serialize/deserialize, device-alignment fixes, and extensive unit and cross-backend tests.

Changes

DPA4 descriptor and fitting backend for dpmodel

Layer / File(s)	Summary
Math, Lebedev, and spherical harmonics `deepmd/dpmodel/utils/lebedev.py`, `deepmd/dpmodel/utils/spherical_harmonics.py`, `deepmd/pt/model/descriptor/sezm_nn/lebedev.py`	Packaged Lebedev rule loader, NumPy real spherical harmonics, and PT Lebedev loader refactored to reuse dpmodel utilities.
Radial & projection foundation `deepmd/dpmodel/descriptor/dpa4_nn/radial.py`, `deepmd/dpmodel/descriptor/dpa4_nn/projection.py`, `deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py`	Radial bases and MLP, C³ cutoff envelope, Lebedev S2 projector, and quaternion→Wigner‑D calculator with zonal extraction and pt‑compatible persistence.
Edge cache & initial embeddings `deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py`, `deepmd/dpmodel/descriptor/dpa4_nn/embedding.py`	Padded-edge cache builder (E = nf*nloc*nnei), masked-safe geometry, deterministic gamma handling, Wigner outputs, and type/geometric/environment initial embeddings.
SO(3) core ops and utils `deepmd/dpmodel/descriptor/dpa4_nn/so3.py`, `deepmd/dpmodel/descriptor/dpa4_nn/utils.py`, `deepmd/dpmodel/descriptor/dpa4_nn/indexing.py`	Focus/Channel/SO3 linear layers, numeric helpers (init, safe-norm), and SO(3) indexing/projection utilities.
Activation, norm, grid `deepmd/dpmodel/descriptor/dpa4_nn/activation.py`, `deepmd/dpmodel/descriptor/dpa4_nn/norm.py`, `deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py`	Gated/SwiGLU activations, RMS‑based equivariant norms (packed/reduced), S2 grid mixer/grid-net with serialization.
SO(2) message passing & attention `deepmd/dpmodel/descriptor/dpa4_nn/attention.py`, `deepmd/dpmodel/descriptor/dpa4_nn/so2.py`	Destination-wise envelope-gated softmax and padded-edge SO(2) convolution (radial mixers, attention heads, aggregation), with guarded-not-ported flags.
Equivariant FFN & interaction block `deepmd/dpmodel/descriptor/dpa4_nn/ffn.py`, `deepmd/dpmodel/descriptor/dpa4_nn/block.py`	SO(3)-equivariant FFN, optional S2-grid activation integration, and SeZM interaction-block stack with residuals and pt-compatible IO.
Descriptor surface & device fixes `deepmd/dpmodel/descriptor/dpa4.py`, `deepmd/dpmodel/descriptor/__init__.py`, `deepmd/dpmodel/utils/network.py`	DescrptDPA4 registration/constructor/forward (edge cache → interaction blocks → readout), interface getters, serialize/deserialize, update_sel, and device-aligned parameter ops.
Fitting networks `deepmd/dpmodel/fitting/dpa4_ener.py`, `deepmd/dpmodel/fitting/__init__.py`	GLU fitting networks per-type, SeZMNetworkCollection, auto-neuron resolution, registration as SeZMEnergyFittingNet, and serialization.
Tests `source/tests/common/dpmodel/`, `source/tests/consistent/`	Torch-free unit tests for descriptor and fitting, Lebedev/spherical-harmonics tests, and DP/PT consistency test suites.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant DescrptDPA4
  participant EdgeCache
  participant SO2Convolution
  participant InteractionBlock
  participant EquivariantFFN
  Client->>DescrptDPA4: call(coord_ext, atype_ext, nlist, ...)
  DescrptDPA4->>EdgeCache: build_edge_cache(...)
  EdgeCache->>SO2Convolution: edge_cache, radial_feat
  SO2Convolution->>InteractionBlock: message-passing outputs
  InteractionBlock->>EquivariantFFN: block readout (l=0)
  EquivariantFFN->>DescrptDPA4: scalar mixing -> descriptor
  DescrptDPA4->>Client: return (descriptor, None, None, None, None)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

deepmodeling/deepmd-kit#5430: Related threading of comm_dict through descriptor forward paths and multi-rank exchange hooks.

Suggested labels

Core, Docs, Examples

Suggested reviewers

iProzd
njzjz


coderabbitai

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py`:
- Around line 202-205: The final validity mask updates src_local_safe but leaves
nlist_safe containing pre-filtered neighbor indices, which can leak out-of-range
indices into neighbor_coord_index and cause xp.take to fail; after computing
mask = mask & (src_local >= 0) & (src_local < nloc) and creating src_local_safe,
also zero out entries in nlist_safe where mask is false (e.g., nlist_safe =
xp.where(mask, nlist_safe, xp.zeros_like(nlist_safe))) so nlist_safe is aligned
with the final mask and safe when mapping is None before downstream use in
neighbor_coord_index/xp.take.
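The failure mode in this comment can be reproduced with a toy neighbor list. The sketch below is an assumption-laden illustration in NumPy (the real code uses an array-API namespace `xp` and a separate `src_local`); here the neighbor list itself plays the role of `src_local`, as it would when mapping is None.

```python
import numpy as np

nloc = 3
coord = np.arange(nloc * 3, dtype=np.float64).reshape(nloc, 3)
# Padded neighbor list: -1 marks an empty slot, 7 is a stale out-of-range index.
nlist = np.array([[1, 2, -1], [0, 7, -1], [0, 1, -1]])

# First-pass mask and the "safe" list derived from it.
mask = nlist >= 0
nlist_safe = np.where(mask, nlist, np.zeros_like(nlist))

# The final mask additionally rejects indices outside [0, nloc).
mask = mask & (nlist >= 0) & (nlist < nloc)

# Gathering with the stale list fails because index 7 is still present.
try:
    np.take(coord, nlist_safe.reshape(-1), axis=0)
    stale_list_failed = False
except IndexError:
    stale_list_failed = True

# Re-align the list with the final mask before gathering.
nlist_safe = np.where(mask, nlist_safe, np.zeros_like(nlist_safe))
neighbor_coord = np.take(coord, nlist_safe.reshape(-1), axis=0).reshape(nloc, -1, 3)
neighbor_coord = np.where(mask[..., None], neighbor_coord, 0.0)
print(stale_list_failed, neighbor_coord[1])
```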

In `@deepmd/dpmodel/descriptor/dpa4_nn/embedding.py`:
- Around line 155-160: The flattened index array (index) must be range-validated
before calling xp.take to prevent negative atype values from selecting the last
row silently; in the embedding extraction code that uses
self.adam_type_embedding, compute the flattened index from atype (as currently
done) and explicitly check for any index < 0 or index >= allowed_max (use
self.ntypes or the appropriate max id for this embedding including padding if
applicable) and raise a clear error if any are out of range, only then call
xp.take and reshape to (*atype.shape, self.embed_dim).
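A minimal NumPy sketch of the suggested guard follows. The helper name `take_type_embedding` is hypothetical, and the upper bound is assumed to be the number of rows in the table; the real embedding may reserve an extra padding row, in which case the bound would change accordingly.

```python
import numpy as np


def take_type_embedding(table, atype):
    """Look up per-atom embedding rows after validating the type indices."""
    ntypes, embed_dim = table.shape
    index = np.reshape(atype, (-1,))
    # Negative indices would silently wrap around in np.take, so reject them here.
    if index.size and (index.min() < 0 or index.max() >= ntypes):
        raise ValueError(
            f"atype must lie in [0, {ntypes}), got range "
            f"[{int(index.min())}, {int(index.max())}]"
        )
    rows = np.take(table, index, axis=0)
    return np.reshape(rows, (*atype.shape, embed_dim))


table = np.arange(6, dtype=np.float64).reshape(3, 2)  # 3 types, embed_dim 2
atype = np.array([[0, 2], [1, 1]])
embedded = take_type_embedding(table, atype)

# Without the guard, -1 quietly selects the last row.
silently_wrapped = np.take(table, np.array([-1]), axis=0)
print(embedded.shape, silently_wrapped)
```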

In `@deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py`:
- Around line 519-533: The constructor S2GridNet currently sets
grid_method="e3nn" which triggers S2GridProjector to reject that backend; change
the default value of the grid_method parameter in S2GridNet(...) from "e3nn" to
"lebedev" and make the identical default change for the grid_method parameter in
the public constructor/function in projection.py (where S2GridProjector is used)
so the default path uses the implemented Lebedev backend; update any related
docstring/default mentions to match "lebedev".

In `@deepmd/dpmodel/descriptor/dpa4_nn/norm.py`:
- Around line 215-217: Rename the single-letter loop variable l to a descriptive
name (e.g., degree) to satisfy Ruff E741: replace occurrences of "for l in
range(self.lmax + 1):" with "for degree in range(self.lmax + 1):" and update
uses of l inside the loop (e.g., w = scale / (2 * l + 1) and (2 * l + 1)) to use
degree instead; do the same for the other similar loop (the one that also
extends weights_list) so variables like weights_list, self.lmax, and scale
remain unchanged while only the loop variable is renamed.

In `@deepmd/dpmodel/descriptor/dpa4_nn/radial.py`:
- Around line 383-387: The stored self.adam_freqs is a NumPy array but
RadialBasis.call() multiplies it with backend arrays from build_edge_cache(),
causing backend/device mismatches; convert adam_freqs into the active array
namespace before runtime arithmetic by calling the array_api_compat backend's
asarray with the correct device (e.g., xp.asarray(freqs, device=device)) where
adam_freqs is set or right before use in RadialBasis.call(); update
creation/assignment of self.adam_freqs (and/or its usage in RadialBasis.call())
to ensure it is an xp array compatible with r/edge_len so r * freqs and r -
freqs use the same backend.

In `@deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py`:
- Around line 245-249: The helper currently re-regularizes a caller-provided
edge_len (doing xp.sqrt(edge_len*edge_len + eps*eps)), which double-applies the
eps floor; change the logic so edge_len is computed with safe_norm(edge_vec,
eps) only when None and otherwise left unchanged. In other words, in the block
around edge_vec/edge_unit (wignerd.py) remove the else re-regularization and
simply use the passed-in edge_len; keep using safe_norm(edge_vec, eps) when
edge_len is None so callers (e.g. build_edge_cache) that already passed
safe_norm are not perturbed.
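The effect of the double floor is easy to quantify. In the sketch below, `safe_norm` is assumed to have the form sqrt(|v|^2 + eps^2), inferred from the re-regularization expression quoted in the comment; `edge_unit` is a hypothetical stand-in for the helper under review, and `eps` is exaggerated so the difference is visible.

```python
import numpy as np


def safe_norm(vec, eps):
    """Assumed form of the regularized norm: sqrt(|v|^2 + eps^2)."""
    return np.sqrt(np.sum(vec * vec, axis=-1) + eps * eps)


def edge_unit(edge_vec, eps, edge_len=None):
    """Normalize edge vectors; regularize only when no length is supplied."""
    if edge_len is None:
        edge_len = safe_norm(edge_vec, eps)
    # A caller-provided edge_len is trusted as already regularized.
    return edge_vec / edge_len[..., None]


eps = 1e-3  # exaggerated for illustration
edge_vec = np.array([[3e-3, 4e-3, 0.0]])  # |v| = 5e-3

once = safe_norm(edge_vec, eps)           # sqrt(|v|^2 + eps^2)
twice = np.sqrt(once * once + eps * eps)  # sqrt(|v|^2 + 2 eps^2): the bug

unit_from_cache = edge_unit(edge_vec, eps, edge_len=once)
unit_default = edge_unit(edge_vec, eps)
print(once, twice)
```

With the fix, passing a cached length and letting the helper compute it give identical unit vectors.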

In `@deepmd/dpmodel/descriptor/dpa4.py`:
- Around line 374-376: Rename the single-letter loop/comprehension variable l to
a descriptive name (e.g., degree) wherever it's used with self.l_schedule to
avoid E741; specifically update the comprehensions that assign self.ebed_dims =
[get_so3_dim_of_lmax(l) for l in self.l_schedule] and self.rad_sizes_per_block =
[l + 1 for l in self.l_schedule], and any loops or comprehensions inside
_init_node_l_schedules that use l (also at the other occurrences around the
711-716 and 725 regions) so they use degree (or similar) instead.
- Around line 496-500: The film_scale_strength_log and film_shift_strength_log
are initialized as NumPy arrays but later used with xp.exp inside use_env_seed
(xp may be non-NumPy), causing backend/device mismatch; update the code so these
logs are converted to the active backend/device before calling xp.exp — either
initialize them using the active xp (when available) or, inside use_env_seed
before xp.exp(self.film_scale_strength_log[...]) /
xp.exp(self.film_shift_strength_log[...]), call xp.asarray(...) or xp.array(...)
(preserving dtype) to move the data onto xp/device; reference symbols:
film_scale_strength_log, film_shift_strength_log, use_env_seed, xp.exp.

In `@deepmd/dpmodel/fitting/dpa4_ener.py`:
- Around line 240-253: The _convert_key function currently uses asserts and
allows negative or out-of-range keys (leading to unintended negative indexing);
replace asserts with explicit validations: if key is int, check isinstance(key,
int) and 0 <= key < self.ntypes**self.ndim and raise ValueError on failure; if
key is str, parse to tuple but then validate the tuple length equals self.ndim
and each element is an int in range 0 <= tt < self.ntypes; if key is tuple,
validate same length and per-element bounds and types; raise TypeError for wrong
types and ValueError for out-of-range values so negative indices cannot slip
through and assertions are not relied upon.
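The validation requested here could look roughly like the standalone function below. This is a sketch under stated assumptions, not the PR's `_convert_key`: the string key format (a prefix followed by underscore-separated type ids) and the flattening formula `sum(tt * ntypes**ii)` are assumptions made for illustration.

```python
def convert_key(key, ntypes, ndim):
    """Map an int, str, or tuple key to a flat network index with explicit checks."""
    if isinstance(key, bool):
        raise TypeError("bool is not a valid key")
    if isinstance(key, int):
        if not 0 <= key < ntypes**ndim:
            raise ValueError(f"key {key} out of range [0, {ntypes**ndim})")
        return key
    if isinstance(key, str):
        # Assumed format: "<prefix>_<t0>_<t1>..."; the prefix is discarded.
        try:
            key = tuple(int(tt) for tt in key.split("_")[1:])
        except ValueError as exc:
            raise ValueError(f"cannot parse key {key!r}") from exc
    if not isinstance(key, tuple):
        raise TypeError(f"unsupported key type: {type(key).__name__}")
    if len(key) != ndim:
        raise ValueError(f"expected {ndim} type indices, got {len(key)}")
    for tt in key:
        if isinstance(tt, bool) or not isinstance(tt, int):
            raise TypeError(f"type index must be int, got {type(tt).__name__}")
        if not 0 <= tt < ntypes:
            raise ValueError(f"type index {tt} out of range [0, {ntypes})")
    # Assumed flattening: first index varies fastest.
    return sum(tt * ntypes**ii for ii, tt in enumerate(key))


flat_from_tuple = convert_key((1, 2), ntypes=3, ndim=2)
flat_from_str = convert_key("t_1_2", ntypes=3, ndim=2)
print(flat_from_tuple, flat_from_str)
```

Unlike `assert`, these checks survive `python -O`, and a negative key can no longer fall through to Python's wrap-around indexing.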

In `@deepmd/dpmodel/utils/lebedev.py`:
- Line 68: The code currently coerces precision with int(precision) when
building rule_key which silently truncates invalid values (e.g., 11.9); change
this to validate and reject non-integer precision instead: in the function where
rule_key is created (the variable rule_key that uses f"{int(precision):03d}"),
assert that precision is an instance of int (or raise a ValueError/TypeError
with a clear message) before formatting, and then build rule_key with
f"{precision:03d}" so only true integer precisions are accepted.
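One way to reject non-integer precisions is sketched below. It deviates slightly from the suggestion: instead of `isinstance(precision, int)` it uses `operator.index`, which also accepts NumPy integer scalars while still refusing floats. The function name `make_rule_key` is hypothetical; only the `:03d` key format comes from the comment.

```python
import operator


def make_rule_key(precision):
    """Format a Lebedev precision as a zero-padded key, rejecting non-integers."""
    try:
        # operator.index raises TypeError for floats such as 11.9.
        value = operator.index(precision)
    except TypeError as exc:
        raise TypeError(
            f"precision must be an integer, got {precision!r}"
        ) from exc
    return f"{value:03d}"


key = make_rule_key(11)
try:
    make_rule_key(11.9)
    float_rejected = False
except TypeError:
    float_rejected = True
print(key, float_rejected)
```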

In `@deepmd/dpmodel/utils/spherical_harmonics.py`:
- Around line 66-67: Guard against scalar inputs before indexing by checking the
rank of vecs first: convert vecs to an array if not already (np.asarray(vecs))
and use vecs.ndim (or np.ndim(vecs)) to detect scalars; if vecs.ndim == 0 (or
otherwise <1) or if vecs.shape[-1] != 3 then raise the existing ValueError.
Update the validation around the existing vecs check so scalars consistently
trigger the same ValueError instead of causing an IndexError.
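A short sketch of the rank-first check, assuming the input is validated in a helper (the name `check_vecs` is hypothetical) before any spherical-harmonics math runs:

```python
import numpy as np


def check_vecs(vecs):
    """Return vecs as a float array of shape (..., 3) or raise ValueError."""
    vecs = np.asarray(vecs, dtype=np.float64)
    # Test the rank first: a 0-d array has an empty shape, so shape[-1] would
    # raise IndexError instead of the intended ValueError.
    if vecs.ndim < 1 or vecs.shape[-1] != 3:
        raise ValueError(f"vecs must have shape (..., 3), got {vecs.shape}")
    return vecs


good = check_vecs([[0.0, 0.0, 1.0]])

errors = []
for bad in (1.0, [1.0, 2.0], np.zeros((4, 2))):
    try:
        check_vecs(bad)
    except ValueError as exc:
        errors.append(str(exc))
print(good.shape, len(errors))
```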


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6517a5c6-fe0b-4fdb-9145-25ca001d8bb4

📥 Commits

Reviewing files that changed from the base of the PR and between 890e38a and 80b4ba1.

📒 Files selected for processing (30)

deepmd/dpmodel/descriptor/__init__.py
deepmd/dpmodel/descriptor/dpa4.py
deepmd/dpmodel/descriptor/dpa4_nn/__init__.py
deepmd/dpmodel/descriptor/dpa4_nn/activation.py
deepmd/dpmodel/descriptor/dpa4_nn/attention.py
deepmd/dpmodel/descriptor/dpa4_nn/block.py
deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
deepmd/dpmodel/descriptor/dpa4_nn/ffn.py
deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
deepmd/dpmodel/descriptor/dpa4_nn/indexing.py
deepmd/dpmodel/descriptor/dpa4_nn/norm.py
deepmd/dpmodel/descriptor/dpa4_nn/projection.py
deepmd/dpmodel/descriptor/dpa4_nn/radial.py
deepmd/dpmodel/descriptor/dpa4_nn/so2.py
deepmd/dpmodel/descriptor/dpa4_nn/so3.py
deepmd/dpmodel/descriptor/dpa4_nn/utils.py
deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py
deepmd/dpmodel/fitting/__init__.py
deepmd/dpmodel/fitting/dpa4_ener.py
deepmd/dpmodel/utils/lebedev.py
deepmd/dpmodel/utils/lebedev_rules.npz
deepmd/dpmodel/utils/spherical_harmonics.py
deepmd/pt/model/descriptor/sezm_nn/lebedev.py
source/tests/common/dpmodel/test_descrpt_dpa4.py
source/tests/common/dpmodel/test_dpa4_ener.py
source/tests/common/dpmodel/test_lebedev_sh.py
source/tests/consistent/descriptor/test_dpa4.py
source/tests/consistent/fitting/test_dpa4_ener.py
source/tests/pt/model/test_dpa4_dpmodel_parity.py

codecov · 2026-06-11T14:57:51Z

Codecov Report

❌ Patch coverage is 96.68493% with 121 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.19%. Comparing base (890e38a) to head (ed0e303).

Files with missing lines	Patch %	Lines
deepmd/dpmodel/descriptor/dpa4.py	90.12%	47 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/so2.py	97.60%	16 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/projection.py	90.78%	13 Missing ⚠️
deepmd/dpmodel/fitting/dpa4_ener.py	94.04%	10 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/block.py	96.42%	7 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/norm.py	97.10%	7 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py	98.65%	6 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py	97.60%	5 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/so3.py	97.77%	4 Missing ⚠️
deepmd/dpmodel/descriptor/dpa4_nn/radial.py	98.08%	3 Missing ⚠️
... and 2 more

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5515      +/-   ##
==========================================
+ Coverage   81.52%   82.19%   +0.67%     
==========================================
  Files         872      891      +19     
  Lines       97964   101600    +3636     
  Branches     4241     4240       -1     
==========================================
+ Hits        79865    83511    +3646     
+ Misses      16795    16787       -8     
+ Partials     1304     1302       -2


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/dpmodel/descriptor/dpa4_nn/so2.py`:
- Around line 394-397: Several raw NumPy buffers are not being aligned to the
current device; wrap each offending buffer with xp.asarray(..., device=device)
where they are materialized: ensure so2_linear.bias0 (the bias0 reshape block in
the SO(2) bias correction), AtomExcludeMask.type_mask, and NativeLayer.idt are
converted via xp.asarray(original_array, device=device) before any
reshape/assignment/usage so the backend (xp) handles device placement
consistently.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9d0405b9-fbf2-4d0a-b117-606668787ddc

📥 Commits

Reviewing files that changed from the base of the PR and between 80b4ba1 and 93e4d0b.

📒 Files selected for processing (20)

deepmd/dpmodel/descriptor/dpa4.py
deepmd/dpmodel/descriptor/dpa4_nn/activation.py
deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
deepmd/dpmodel/descriptor/dpa4_nn/norm.py
deepmd/dpmodel/descriptor/dpa4_nn/projection.py
deepmd/dpmodel/descriptor/dpa4_nn/radial.py
deepmd/dpmodel/descriptor/dpa4_nn/so2.py
deepmd/dpmodel/descriptor/dpa4_nn/so3.py
deepmd/dpmodel/fitting/dpa4_ener.py
deepmd/dpmodel/utils/exclude_mask.py
deepmd/dpmodel/utils/lebedev.py
deepmd/dpmodel/utils/network.py
deepmd/dpmodel/utils/spherical_harmonics.py
source/tests/common/dpmodel/test_dpa4_ener.py
source/tests/common/dpmodel/test_lebedev_sh.py
source/tests/consistent/descriptor/test_dpa4.py
source/tests/consistent/fitting/test_dpa4_ener.py
source/tests/pt/model/test_dpa4_dpmodel_parity.py

💤 Files with no reviewable changes (2)

source/tests/consistent/descriptor/test_dpa4.py
source/tests/consistent/fitting/test_dpa4_ener.py

🚧 Files skipped from review as they are similar to previous changes (11)

deepmd/dpmodel/utils/spherical_harmonics.py
deepmd/dpmodel/utils/lebedev.py
source/tests/common/dpmodel/test_dpa4_ener.py
source/tests/common/dpmodel/test_lebedev_sh.py
deepmd/dpmodel/descriptor/dpa4_nn/projection.py
deepmd/dpmodel/descriptor/dpa4_nn/radial.py
deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
deepmd/dpmodel/descriptor/dpa4_nn/so3.py
deepmd/dpmodel/descriptor/dpa4_nn/norm.py
deepmd/dpmodel/fitting/dpa4_ener.py


wanghan-iapcm · 2026-06-11T23:06:47Z

pre-commit.ci run


…odeling#5522)

PR-2 of the DPA4/SeZM porting series (PR-1: deepmodeling#5515, dpmodel core). Makes DPA4 trainable and torch.export-compatible on the pt_expt backend.

## What's included

- **Descriptor wrapper** (`deepmd/pt_expt/descriptor/dpa4.py`): thin `@torch_module` `DescrptDPA4` registered as `SeZM`/`sezm`/`DPA4`/`dpa4`, plus explicit converter registrations for the two dpmodel classes that cannot be auto-wrapped via serialize round-trip: `WignerDCalculator` (deserialize raises by design; rebuilt from `lmax`/`eps`/`precision`) and `SwiGLU` (parameter-free, no serialize).
- **Trainable-weight promotion**: dpmodel stores DPA4 weights as bare numpy arrays (mirroring pt SeZM parameter names), which the generic ndarray rule registers as non-trainable buffers. A localized `_TRAINABLE_ATTRS` table re-registers exactly the attributes that are `nn.Parameter` in the pt SeZM implementation, after `__init__`/`deserialize`. Two supporting infra fixes: `_try_convert_list` now handles Optional-module lists (`SO2Convolution.non_linearities` ends with `None`), and `xp_asarray_nodetach` (`deepmd/dpmodel/array_api.py`) replaces `xp.asarray` on weight attributes in dpa4 dpmodel code so torch autograd is not silently detached.
- **Fitting wrapper** (`deepmd/pt_expt/fitting/dpa4_ener.py`): `SeZMEnergyFittingNet` (`dpa4_ener`/`sezm_ener`), `GLUFittingNet`, and a `SeZMNetworkCollection` wrapper mirroring the `NetworkCollection` ModuleDict pattern so GLU weights are optimizer-visible.
- **Model assembly** (`deepmd/pt_expt/model/get_model.py`): `get_sezm_model` dispatched for model type `dpa4`/`sezm`/`SeZM`/`DPA4`, mirroring pt's config handling (descriptor/fitting type defaults, `pair_exclude_types` reconciliation). Fail-fast `NotImplementedError` guards for pt-only extensions: spin, `bridging_method != none`, `lora`, `use_compile`, `preset_out_bias`. `enable_tf32` is accepted and ignored (pt_expt always runs "highest" matmul precision; documented).

## Tests

- pt_expt trio (`source/tests/pt_expt/descriptor/test_dpa4.py`): consistency (serde round-trip + dpmodel reference), `torch.export.export`, `make_fx` through forward + `autograd.grad`; fitting analogs in `source/tests/pt_expt/fitting/test_dpa4_ener.py`.
- **Parameter-gradient parity vs pt** (`source/tests/pt/model/test_dpa4_ptexpt_grad_parity.py`): weight-copied pt SeZM vs pt_expt, quadratic loss, every parameter's gradient compared 1:1 via serialize-tree alignment (83/83 parameters, no dead parameters, max deviation ~3e-11 rel from fp64 summation order).
- Model assembly tests (`source/tests/pt_expt/model/test_get_model_dpa4.py`): aliases, defaults, exclude-types branches, all NIE guards.
- Training end-to-end (`source/tests/pt_expt/test_training.py::test_training_loop_dpa4`) on the water fixture through the full trainer.
- Cross-backend consistency rows enabled for pt_expt (`source/tests/consistent/`): 20 new comparisons pass at existing harness tolerances.
- PR-1 parity suites stay green (381 passed) including for the small dpmodel fixes this PR needed (serialize via `to_numpy_array`, explicit-dtype zeros, ParameterList-safe `_load_variables`, `charge_spin` kwarg for atomic-model interface compatibility).

## Known limitations

- No freeze/`.pt2` export, no Python `DeepEval`, no C++/LAMMPS inference; that is PR-3 of the series.
- No multitask: `share_params` raises `NotImplementedError`.
- No model-dict deserialization of pt `SeZMModel` checkpoints (PR-3).
- pt-only SeZM extensions fail fast rather than being ported: spin, bridging/ZBL, LoRA, `use_compile`, `preset_out_bias`; `enable_tf32` ignored (numerically conservative).
- `_try_convert_list`'s ParameterList branch does not respect `trainable=False` for list-valued weights (pre-existing backend-wide behavior).
- CUDA validated at tolerance level only in CI; performance vs the pt Triton implementation is unbenchmarked.

## Summary by CodeRabbit

* **New Features**
  * Added PyTorch-experimental DPA4/SeZM descriptor and energy-fitting components, with public package exports.
  * Added EnergyModel factory/dispatch to build DPA4/SeZM models from configs.
* **Improvements**
  * Enhanced DPA4/SeZM runtime handling to keep gradient connectivity and ensure device-consistent tensor placement.
  * Updated DPA4 descriptor interface to accept optional `charge_spin` for compatibility.
* **Tests**
  * Expanded experimental coverage with serialization/parity checks, FX tracing/export validation, trainable-parameter behavior tests, and descriptor/fitting gradient parity verification.

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…eepmodeling#5540)

PR-3 (final) of the DPA4/SeZM porting series, pt_expt **inference**: freeze to `.pt2`, Python `DeepEval`, pt→pt_expt checkpoint interop, C++ single-rank, and LAMMPS single-rank. Follows PR-1 (deepmodeling#5515, dpmodel core) and PR-2 (deepmodeling#5522, pt_expt training/export).

## What's included

- **Model freeze to `.pt2`** (`deepmd/pt_expt/model/ener_model.py`, `deepmd/dpmodel/.../ener_model.py`, `deepmd/dpmodel/descriptor/dpa4.py`): register `EnergyModel` under `sezm_ener`/`dpa4_ener` model-type aliases so `BaseModel.deserialize` resolves a standard DPA4 energy model (whose fitting type is `sezm_ener`). Fixed a `torch.export` specialization where `int()` on symbolic shapes baked `nf*nloc` (embedding/so2/attention).
- **NoPBC export fix** (`deepmd/dpmodel/descriptor/dpa4.py`): the `atype_ext[:, :nloc]` slice emitted a spurious `Ne(nall, nloc)` shape guard that crashed the compiled artifact when `nall==nloc` (no ghosts); replaced with `xp_take_first_n` (index_select). NoPBC now matches PBC.
- **pt→pt_expt checkpoint interop** (`deepmd/pt_expt/model/model.py`): `BaseModel.deserialize` unwraps pt's bespoke `SeZMModel` serialization (`type:"SeZM"`, nested `sezm_atomic` atomic model with the pt-only dens head), validates versions, rejects unsupported features (bridging/lora/dens/active_mode) with `NotImplementedError`, and delegates to the standard path.
- **Warn on silently-ignored flags** (`use_amp` descriptor, `enable_tf32` model): warn-once instead of silent drop.

## Tests

- **Model freeze** `source/tests/pt_expt/model/test_dpa4_export.py`: dual-artifact `.pt2`, metadata, AOTI load, artifact-vs-eager parity (1e-10). *(CI-skipped: AOTI is slow; run locally.)*
- **DeepEval parity vs pt** `source/tests/pt_expt/infer/test_dpa4_deep_eval.py`: pt `.pt` vs pt_expt `.pt2` energy/force/global-virial/atom-energy at fp64 1e-10, **PBC and NoPBC**; doubles as the checkpoint-interop proof. Per-atom virial compared by sum (pt's edge-scatter from deepmodeling#5518 redistributes it; global virial matches). *(CI-skipped: AOTI.)*
- **Interop unit tests** `source/tests/pt_expt/model/test_dpa4_interop.py` (CI-runnable, no AOTI): happy-path pt-checkpoint→pt_expt round-trip + every guard branch + version validation + `@variables` filtering.
- **Alias deserialize guard** + **use_amp/enable_tf32 warn-once** tests (CI-runnable).
- **Fixture generator** `source/tests/infer/gen_dpa4.py` (+ wired into `source/install/test_cc_local.sh`).
- **C++ single-rank** `source/api_cc/tests/test_deeppot_dpa4_ptexpt.cc`: 20 tests (double+float), dpa3-matched tolerances. Validated locally.
- **LAMMPS single-rank** `source/lmp/tests/test_lammps_dpa4_pt2.py`: parity + `atom_modify map yes` + the deepmodeling#5450 no-atom-map fail-fast. **Validated on a GPU box (7 passed).**

PR-1 parity suites stay green; the small dpmodel edits are parity-revalidated.

## Known limitations

- **Single-rank only.** Multi-rank/MPI LAMMPS for DPA4 is deferred (no live multi-rank cell; the with-comm artifact compiles but its runtime is not exercised). DPA4 is a message-passing descriptor, so multi-rank follows the existing deepmodeling#5450/deepmodeling#5430 machinery in a later PR.
- **No `.pth` (torch.jit) DPA4**: the pt backend has no `sezm_ener` *model* registration, so `.pth` freeze of a standard DPA4 energy model isn't available; not needed for the pt_expt inference path.
- **Per-atom virial** is not compared element-wise pt-vs-pt_expt (only its global sum); deepmodeling#5518 changed pt's edge-scatter distribution; both are correct, the distribution differs.
- **AOTI tests are CI-skipped** (multi-minute compile): the freeze/DeepEval paths are validated locally, not in CI; the interop/alias/warn tests give CI coverage of the non-AOTI logic.
- **fp64 only**; fp32 freeze untested. CUDA validated at LAMMPS level on a GPU box; the AOTI parity numbers are from CPU fp64.
- **`use_amp`/`enable_tf32`** remain functionally ignored (now warned), by design for this series.
- pt SeZM features out of scope (guarded `NotImplementedError`): spin, ZBL bridging, LoRA, dens/direct-force/denoising heads, SO3 grid projection, GridMLP, SO(2) attention extensions.

## Summary by CodeRabbit

# Release Notes

* **New Features**
  * Enabled DPA4 model inference via the pt_expt backend using dual-artifact compilation.
  * Registered the EnergyModel under additional aliases: `sezm_ener` and `dpa4_ener`.
* **Improvements**
  * Improved dynamic/symbolic shape handling across DPA4 components for export/tracing stability.
  * Enhanced pt SeZM/DPA4 checkpoint deserialization and normalization for interoperability.
  * Added one-time warnings when `use_amp` or `enable_tf32` settings are ineffective.
* **Tests**
  * Added C++ and Python coverage for pt2 inference, LAMMPS integration, model export/freeze, parity, interop, and warning behavior.

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Han Wang added 21 commits June 11, 2026 09:14

feat(dpmodel): relocate Lebedev rules with numpy loader

03cc9ee

test(dpmodel): cover Lebedev loader error paths; review fixes

eccbe4d

feat(dpmodel): general-lmax real spherical harmonics (e3nn conventions)

ea0c471

fix(dpmodel): match e3nn zero-vector handling in real SH; e3nn-free convention tests

f8798e1

feat(dpmodel): dpa4_nn indexing + utils

18dde4b

test(dpmodel): accurate parity-file docstring, import order, literal index anchor

8a12493

feat(dpmodel): dpa4_nn radial basis, envelope, radial MLP

f51d986

feat(dpmodel): dpa4_nn Wigner-D calculator + edge quaternions

80fc4a8

feat(dpmodel): dpa4_nn norms, SO3 linear, gated activation

3916c21

test(dpmodel): tighten fp32 parity tolerances to measured ulp

dee01f5

feat(dpmodel): dpa4_nn S2 grid projection + grid nets (Lebedev)

23107df

test(dpmodel): keep fp32 parity tests tolerance-based for truncation robustness

3115285

feat(dpmodel): dpa4_nn SO2 linear + convolution with padded-edge aggregation

66be7c3

docs(dpmodel): clarify EdgeCache padded-layout and cache-reuse contract

a89ce39

feat(dpmodel): dpa4_nn type, geometric, environment embeddings

2d6fa3a

fix(dpmodel): uniform precision-key lookup; honest GIE zonal test scope note

fab9bf3

feat(dpmodel): dpa4_nn padded edge-cache builder

abc0574

feat(dpmodel): dpa4_nn equivariant FFN + interaction block

0c7c822

feat(dpmodel): DPA4 descriptor (core path)

c8ebada

feat(dpmodel): dpa4_ener GLU energy fitting

ce2e2a9

test(dpmodel): DPA4 cross-backend consistency tests and model-dict contract

b2bbfff

dosubot Bot added enhancement new feature labels Jun 11, 2026

github-actions Bot added the Python label Jun 11, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

80b4ba1

for more information, see https://pre-commit.ci

wanghan-iapcm requested a review from OutisLi June 11, 2026 14:07

github-advanced-security AI found potential problems Jun 11, 2026

wanghan-iapcm requested a review from njzjz June 11, 2026 14:10

coderabbitai Bot reviewed Jun 11, 2026

fix(dpmodel): address review findings (xp-namespace constants, validation, lint)

93e4d0b

github-advanced-security AI found potential problems Jun 11, 2026

Comment thread deepmd/dpmodel/fitting/dpa4_ener.py Fixed

test(dpmodel): ulp tolerance for fp32/fp16 safe_norm parity (CI BLAS variance)

a625528

coderabbitai Bot reviewed Jun 11, 2026

Comment thread deepmd/dpmodel/descriptor/dpa4_nn/so2.py

Han Wang added 2 commits June 12, 2026 00:08

style(dpmodel): NumPy capitalization; explicit dtypes for pylint

225c8c5

fix(dpmodel): xp-namespace conversion for remaining numpy buffers; KeyError in collection lookup

70702ce

wanghan-iapcm enabled auto-merge June 11, 2026 16:45

wanghan-iapcm added this pull request to the merge queue Jun 11, 2026

test(dpmodel): pin pt reference modules to CPU in parity tests (CUDA runners)

177c057

wanghan-iapcm removed this pull request from the merge queue due to a manual request Jun 11, 2026

ci: retrigger checks

cb395c8

wanghan-iapcm enabled auto-merge June 11, 2026 23:07

Han Wang added 3 commits June 12, 2026 08:01

test(dpmodel): also patch import-time device snapshots for CPU parity pinning

0b37974

test(dpmodel): run pt parity references on native device with device-conditional tolerances

33180fa

test(dpmodel): move edge-cache fixture tensors to the pt device too

ed0e303

wanghan-iapcm added this pull request to the merge queue Jun 12, 2026

Merged via the queue into deepmodeling:master with commit 0de53e9 Jun 12, 2026
70 checks passed

wanghan-iapcm deleted the feat-dpmodel-dpa4 branch June 12, 2026 08:42

This was referenced Jun 12, 2026

feat(pt_expt): DPA4 descriptor and fitting (training + export) #5522

Merged

perf(dpa4): opt so3grid #5517

Closed

wanghan-iapcm mentioned this pull request Jun 16, 2026

feat(pt_expt): DPA4 inference (.pt2 freeze, DeepEval, C++, LAMMPS) #5540

Merged

wanghan-iapcm mentioned this pull request Jun 17, 2026

feat(dpmodel): port DPA4/SeZM SO3 grid projection (PR-A: dpmodel core) #5547

Closed

This was referenced Jun 18, 2026

perf(dpa4): opt so3grid (with pt_expt GridProduct wrapping fix) #5552

Merged

feat(dpmodel): complete DPA4/SeZM SO3 grid projection (mirror current pt) #5555

Merged

coderabbitai Bot mentioned this pull request Jun 28, 2026

refactor(dpmodel/dpa4) #5599

Merged


3 participants
