feat(dpmodel): backend-agnostic DPA4 descriptor and fitting (core path) #5515
📝 Walkthrough
Adds a dpmodel (array‑API) implementation of the DPA4/SeZM descriptor, equivariant SO(3)/SO(2) building blocks, S2/Lebedev projection and spherical‑harmonics utilities, GLU energy fitting nets, pt‑compatible serialize/deserialize, device-alignment fixes, and extensive unit and cross-backend tests.
Changes
DPA4 descriptor and fitting backend for dpmodel
Sequence Diagram(s)
sequenceDiagram
participant Client
participant DescrptDPA4
participant EdgeCache
participant SO2Convolution
participant InteractionBlock
participant EquivariantFFN
Client->>DescrptDPA4: call(coord_ext, atype_ext, nlist, ...)
DescrptDPA4->>EdgeCache: build_edge_cache(...)
EdgeCache->>SO2Convolution: edge_cache, radial_feat
SO2Convolution->>InteractionBlock: message-passing outputs
InteractionBlock->>EquivariantFFN: block readout (l=0)
EquivariantFFN->>DescrptDPA4: scalar mixing -> descriptor
DescrptDPA4->>Client: return (descriptor, None, None, None, None)
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py`:
- Around line 202-205: The final validity mask updates src_local_safe but leaves
nlist_safe containing pre-filtered neighbor indices, which can leak out-of-range
indices into neighbor_coord_index and cause xp.take to fail; after computing
mask = mask & (src_local >= 0) & (src_local < nloc) and creating src_local_safe,
also zero out entries in nlist_safe where mask is false (e.g., nlist_safe =
xp.where(mask, nlist_safe, xp.zeros_like(nlist_safe))) so nlist_safe is aligned
with the final mask and safe when mapping is None before downstream use in
neighbor_coord_index/xp.take.
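The masking fix described above can be sketched as follows. This is a minimal illustration, not the code in `edge_cache.py`: NumPy stands in for the array namespace `xp`, the toy `nlist` and `coord` arrays are made up, and `src_local` is assumed to equal `nlist` because `mapping` is `None`.

```python
import numpy as np

# Assumption: NumPy stands in for the array namespace `xp`.
xp = np

nloc = 3
# One frame, three local atoms, two neighbor slots each.
# -1 marks padding; 7 is an out-of-range index only the final mask catches.
nlist = xp.asarray([[[1, -1], [0, 7], [2, 1]]])
src_local = nlist  # assumption: mapping is None, indices pass through

mask = nlist >= 0
mask = mask & (src_local >= 0) & (src_local < nloc)

zeros = xp.zeros_like(nlist)
src_local_safe = xp.where(mask, src_local, zeros)
# Re-mask nlist_safe with the FINAL mask so it stays aligned with it.
nlist_safe = xp.where(mask, nlist, zeros)

# Every index is now in range, so the gather cannot fail.
coord = xp.reshape(xp.arange(nloc * 3, dtype=xp.float64), (nloc, 3))
neighbor_coord = xp.take(coord, xp.reshape(nlist_safe, (-1,)), axis=0)
```

Masked slots point at index 0 after the fix; they gather a valid but meaningless row, which is harmless as long as downstream reductions apply the same mask.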
In `@deepmd/dpmodel/descriptor/dpa4_nn/embedding.py`:
- Around line 155-160: The flattened index array (index) must be range-validated
before calling xp.take to prevent negative atype values from selecting the last
row silently; in the embedding extraction code that uses
self.adam_type_embedding, compute the flattened index from atype (as currently
done) and explicitly check for any index < 0 or index >= allowed_max (use
self.ntypes or the appropriate max id for this embedding including padding if
applicable) and raise a clear error if any are out of range, only then call
xp.take and reshape to (*atype.shape, self.embed_dim).
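A minimal sketch of the range check suggested above, written against NumPy. The function name `take_type_embedding` and the choice of `ValueError` are assumptions for illustration; the actual method and the correct upper bound (with or without a padding row) depend on the embedding class.

```python
import numpy as np


def take_type_embedding(table, atype):
    """Gather rows of `table` for integer types, rejecting bad indices.

    Hypothetical helper: `table` has shape (ntypes, embed_dim).
    """
    ntypes, embed_dim = table.shape
    index = np.reshape(atype, (-1,))
    bad = (index < 0) | (index >= ntypes)
    if bool(np.any(bad)):
        # Without this check, np.take would map -1 to the last row silently.
        raise ValueError(
            f"atype values must lie in [0, {ntypes}); got {index[bad].tolist()}"
        )
    out = np.take(table, index, axis=0)
    return np.reshape(out, (*atype.shape, embed_dim))
```

Note that a data-dependent `raise` like this is an eager-mode check; whether it is acceptable under tracing or export is a separate design question that the review comment does not address.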
In `@deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py`:
- Around line 519-533: The constructor S2GridNet currently sets
grid_method="e3nn" which triggers S2GridProjector to reject that backend; change
the default value of the grid_method parameter in S2GridNet(...) from "e3nn" to
"lebedev" and make the identical default change for the grid_method parameter in
the public constructor/function in projection.py (where S2GridProjector is used)
so the default path uses the implemented Lebedev backend; update any related
docstring/default mentions to match "lebedev".
In `@deepmd/dpmodel/descriptor/dpa4_nn/norm.py`:
- Around line 215-217: Rename the single-letter loop variable l to a descriptive
name (e.g., degree) to satisfy Ruff E741: replace occurrences of "for l in
range(self.lmax + 1):" with "for degree in range(self.lmax + 1):" and update
uses of l inside the loop (e.g., w = scale / (2 * l + 1) and (2 * l + 1)) to use
degree instead; do the same for the other similar loop (the one that also
extends weights_list) so variables like weights_list, self.lmax, and scale
remain unchanged while only the loop variable is renamed.
In `@deepmd/dpmodel/descriptor/dpa4_nn/radial.py`:
- Around line 383-387: The stored self.adam_freqs is a NumPy array but
RadialBasis.call() multiplies it with backend arrays from build_edge_cache(),
causing backend/device mismatches; convert adam_freqs into the active array
namespace before runtime arithmetic by calling the array_api_compat backend's
asarray with the correct device (e.g., xp.asarray(freqs, device=device)) where
adam_freqs is set or right before use in RadialBasis.call(); update
creation/assignment of self.adam_freqs (and/or its usage in RadialBasis.call())
to ensure it is an xp array compatible with r/edge_len so r * freqs and r -
freqs use the same backend.
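The conversion pattern can be sketched as below. `RadialBasisSketch`, its constructor arguments, and the sine basis are hypothetical; only the idea of converting the stored NumPy buffer into the active namespace before arithmetic comes from the comment. The namespace is passed in explicitly here, and device placement is only noted in a comment because it depends on the backend.

```python
import numpy as np


class RadialBasisSketch:
    """Hypothetical stand-in for a radial basis with stored frequencies."""

    def __init__(self, n_freq, rcut):
        # Stored as a plain NumPy buffer, as in the reviewed code.
        self.freqs = np.arange(1, n_freq + 1, dtype=np.float64) * np.pi / rcut

    def call(self, r, xp=np):
        # Convert the buffer into the active namespace before any arithmetic.
        # A real port would also pass the device of `r` here, for example
        # xp.asarray(..., device=...), so both operands live on one device.
        freqs = xp.asarray(self.freqs, dtype=r.dtype)
        return xp.sin(r[..., None] * freqs)
```

Converting at call time keeps serialization simple (the attribute stays a NumPy array) at the cost of one small copy per call.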
In `@deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py`:
- Around line 245-249: The helper currently re-regularizes a caller-provided
edge_len (doing xp.sqrt(edge_len*edge_len + eps*eps)), which double-applies the
eps floor; change the logic so edge_len is computed with safe_norm(edge_vec,
eps) only when None and otherwise left unchanged. In other words, in the block
around edge_vec/edge_unit (wignerd.py) remove the else re-regularization and
simply use the passed-in edge_len; keep using safe_norm(edge_vec, eps) when
edge_len is None so callers (e.g. build_edge_cache) that already passed
safe_norm are not perturbed.
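To see why the re-regularization matters, here is a small sketch. It assumes `safe_norm` has the form sqrt(|v|^2 + eps^2), which matches the expression quoted in the comment but is not confirmed by it; `edge_unit` is a hypothetical helper name.

```python
import numpy as np


def safe_norm(vec, eps):
    # Assumed form: Euclidean norm with an eps floor under the square root.
    return np.sqrt(np.sum(vec * vec, axis=-1) + eps * eps)


def edge_unit(edge_vec, edge_len=None, eps=1e-7):
    """Normalize edge vectors, regularizing the length at most once."""
    if edge_len is None:
        edge_len = safe_norm(edge_vec, eps)
    # A caller-provided edge_len is trusted as already regularized.
    return edge_vec / edge_len[..., None]


vec = np.array([[0.3, 0.0, 0.4]])
eps = 0.1  # exaggerated so the effect is visible
once = safe_norm(vec, eps)                 # sqrt(0.25 + 0.01)
twice = np.sqrt(once * once + eps * eps)   # sqrt(0.25 + 0.02)
```

With the fix, passing `once` as `edge_len` gives the same unit vectors as letting the helper compute the length itself; the old behavior would have divided by `twice` instead.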
In `@deepmd/dpmodel/descriptor/dpa4.py`:
- Around line 374-376: Rename the single-letter loop/comprehension variable l to
a descriptive name (e.g., degree) wherever it's used with self.l_schedule to
avoid E741; specifically update the comprehensions that assign self.ebed_dims =
[get_so3_dim_of_lmax(l) for l in self.l_schedule] and self.rad_sizes_per_block =
[l + 1 for l in self.l_schedule], and any loops or comprehensions inside
_init_node_l_schedules that use l (also at the other occurrences around the
711-716 and 725 regions) so they use degree (or similar) instead.
- Around line 496-500: The film_scale_strength_log and film_shift_strength_log
are initialized as NumPy arrays but later used with xp.exp inside use_env_seed
(xp may be non-NumPy), causing backend/device mismatch; update the code so these
logs are converted to the active backend/device before calling xp.exp — either
initialize them using the active xp (when available) or, inside use_env_seed
before xp.exp(self.film_scale_strength_log[...]) /
xp.exp(self.film_shift_strength_log[...]), call xp.asarray(...) or xp.array(...)
(preserving dtype) to move the data onto xp/device; reference symbols:
film_scale_strength_log, film_shift_strength_log, use_env_seed, xp.exp.
In `@deepmd/dpmodel/fitting/dpa4_ener.py`:
- Around line 240-253: The _convert_key function currently uses asserts and
allows negative or out-of-range keys (leading to unintended negative indexing);
replace asserts with explicit validations: if key is int, check isinstance(key,
int) and 0 <= key < self.ntypes**self.ndim and raise ValueError on failure; if
key is str, parse to tuple but then validate the tuple length equals self.ndim
and each element is an int in range 0 <= tt < self.ntypes; if key is tuple,
validate same length and per-element bounds and types; raise TypeError for wrong
types and ValueError for out-of-range values so negative indices cannot slip
through and assertions are not relied upon.
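A sketch of the stricter key conversion follows. The class name, the `type_<i>_<j>` string format, and the flattening formula (sum of `tt * ntypes**ii`) are assumptions made for illustration; only the validation rules are taken from the comment.

```python
class KeyConverterSketch:
    """Hypothetical container that maps type keys to a flat index."""

    def __init__(self, ntypes, ndim):
        self.ntypes = ntypes
        self.ndim = ndim

    def convert_key(self, key):
        if isinstance(key, bool):
            raise TypeError(f"unsupported key type: {type(key).__name__}")
        if isinstance(key, int):
            if not 0 <= key < self.ntypes**self.ndim:
                raise ValueError(f"key {key} out of range")
            return key
        if isinstance(key, str):
            # Assumed format: a prefix followed by underscore-separated ints.
            key = tuple(int(tt) for tt in key.split("_")[1:])
        if not isinstance(key, tuple):
            raise TypeError(f"unsupported key type: {type(key).__name__}")
        if len(key) != self.ndim:
            raise ValueError(f"expected {self.ndim} type indices, got {len(key)}")
        for tt in key:
            if isinstance(tt, bool) or not isinstance(tt, int):
                raise TypeError(f"type index must be int, got {tt!r}")
            if not 0 <= tt < self.ntypes:
                raise ValueError(f"type index {tt} out of [0, {self.ntypes})")
        # Assumed flattening: first index varies fastest.
        return sum(tt * self.ntypes**ii for ii, tt in enumerate(key))
```

Explicit exceptions survive `python -O`, which strips `assert` statements, and the lower-bound checks stop negative values from turning into Python's wrap-around indexing.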
In `@deepmd/dpmodel/utils/lebedev.py`:
- Line 68: The code currently coerces precision with int(precision) when
building rule_key which silently truncates invalid values (e.g., 11.9); change
this to validate and reject non-integer precision instead: in the function where
rule_key is created (the variable rule_key that uses f"{int(precision):03d}"),
assert that precision is an instance of int (or raise a ValueError/TypeError
with a clear message) before formatting, and then build rule_key with
f"{precision:03d}" so only true integer precisions are accepted.
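One way to reject non-integer precisions is sketched below. It uses `operator.index` rather than a bare `isinstance(precision, int)` check, which is a variation on the suggestion: it also accepts NumPy integer scalars while still refusing floats. The function name is hypothetical, and the returned string is only the zero-padded part of the key; any prefix used by the real `rule_key` is not shown.

```python
import operator


def lebedev_rule_key(precision):
    """Format a precision as a 3-digit key, refusing lossy conversions."""
    if isinstance(precision, bool):
        raise TypeError("precision must be an integer, not bool")
    try:
        # operator.index accepts true integers only; 11.9 raises TypeError.
        precision = operator.index(precision)
    except TypeError as exc:
        raise TypeError(
            f"precision must be an integer, got {precision!r}"
        ) from exc
    if precision < 0:
        raise ValueError(f"precision must be non-negative, got {precision}")
    return f"{precision:03d}"
```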
In `@deepmd/dpmodel/utils/spherical_harmonics.py`:
- Around line 66-67: Guard against scalar inputs before indexing by checking the
rank of vecs first: convert vecs to an array if not already (np.asarray(vecs))
and use vecs.ndim (or np.ndim(vecs)) to detect scalars; if vecs.ndim == 0 (or
otherwise <1) or if vecs.shape[-1] != 3 then raise the existing ValueError.
Update the validation around the existing vecs check so scalars consistently
trigger the same ValueError instead of causing an IndexError.
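The rank guard can be sketched like this; `check_vecs` is a hypothetical helper name and the error message is illustrative. Checking `ndim` first means `shape[-1]` is never evaluated on a 0-d array, because `or` short-circuits.

```python
import numpy as np


def check_vecs(vecs):
    """Return vecs as a float array of shape (..., 3) or raise ValueError."""
    vecs = np.asarray(vecs, dtype=np.float64)
    # Test the rank before indexing the shape: a scalar has shape ().
    if vecs.ndim < 1 or vecs.shape[-1] != 3:
        raise ValueError(f"vecs must have shape (..., 3), got {vecs.shape}")
    return vecs
```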
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6517a5c6-fe0b-4fdb-9145-25ca001d8bb4
📒 Files selected for processing (30)
- deepmd/dpmodel/descriptor/__init__.py
- deepmd/dpmodel/descriptor/dpa4.py
- deepmd/dpmodel/descriptor/dpa4_nn/__init__.py
- deepmd/dpmodel/descriptor/dpa4_nn/activation.py
- deepmd/dpmodel/descriptor/dpa4_nn/attention.py
- deepmd/dpmodel/descriptor/dpa4_nn/block.py
- deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
- deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
- deepmd/dpmodel/descriptor/dpa4_nn/ffn.py
- deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
- deepmd/dpmodel/descriptor/dpa4_nn/indexing.py
- deepmd/dpmodel/descriptor/dpa4_nn/norm.py
- deepmd/dpmodel/descriptor/dpa4_nn/projection.py
- deepmd/dpmodel/descriptor/dpa4_nn/radial.py
- deepmd/dpmodel/descriptor/dpa4_nn/so2.py
- deepmd/dpmodel/descriptor/dpa4_nn/so3.py
- deepmd/dpmodel/descriptor/dpa4_nn/utils.py
- deepmd/dpmodel/descriptor/dpa4_nn/wignerd.py
- deepmd/dpmodel/fitting/__init__.py
- deepmd/dpmodel/fitting/dpa4_ener.py
- deepmd/dpmodel/utils/lebedev.py
- deepmd/dpmodel/utils/lebedev_rules.npz
- deepmd/dpmodel/utils/spherical_harmonics.py
- deepmd/pt/model/descriptor/sezm_nn/lebedev.py
- source/tests/common/dpmodel/test_descrpt_dpa4.py
- source/tests/common/dpmodel/test_dpa4_ener.py
- source/tests/common/dpmodel/test_lebedev_sh.py
- source/tests/consistent/descriptor/test_dpa4.py
- source/tests/consistent/fitting/test_dpa4_ener.py
- source/tests/pt/model/test_dpa4_dpmodel_parity.py
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5515      +/-   ##
==========================================
+ Coverage   81.52%   82.19%   +0.67%
==========================================
  Files         872      891      +19
  Lines       97964   101600    +3636
  Branches     4241     4240       -1
==========================================
+ Hits        79865    83511    +3646
+ Misses      16795    16787       -8
+ Partials     1304     1302       -2
☔ View full report in Codecov by Harness.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/dpmodel/descriptor/dpa4_nn/so2.py`:
- Around line 394-397: Several raw NumPy buffers are not being aligned to the
current device; wrap each offending buffer with xp.asarray(..., device=device)
where they are materialized: ensure so2_linear.bias0 (the bias0 reshape block in
the SO(2) bias correction), AtomExcludeMask.type_mask, and NativeLayer.idt are
converted via xp.asarray(original_array, device=device) before any
reshape/assignment/usage so the backend (xp) handles device placement
consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9d0405b9-fbf2-4d0a-b117-606668787ddc
📒 Files selected for processing (20)
- deepmd/dpmodel/descriptor/dpa4.py
- deepmd/dpmodel/descriptor/dpa4_nn/activation.py
- deepmd/dpmodel/descriptor/dpa4_nn/edge_cache.py
- deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
- deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
- deepmd/dpmodel/descriptor/dpa4_nn/norm.py
- deepmd/dpmodel/descriptor/dpa4_nn/projection.py
- deepmd/dpmodel/descriptor/dpa4_nn/radial.py
- deepmd/dpmodel/descriptor/dpa4_nn/so2.py
- deepmd/dpmodel/descriptor/dpa4_nn/so3.py
- deepmd/dpmodel/fitting/dpa4_ener.py
- deepmd/dpmodel/utils/exclude_mask.py
- deepmd/dpmodel/utils/lebedev.py
- deepmd/dpmodel/utils/network.py
- deepmd/dpmodel/utils/spherical_harmonics.py
- source/tests/common/dpmodel/test_dpa4_ener.py
- source/tests/common/dpmodel/test_lebedev_sh.py
- source/tests/consistent/descriptor/test_dpa4.py
- source/tests/consistent/fitting/test_dpa4_ener.py
- source/tests/pt/model/test_dpa4_dpmodel_parity.py
💤 Files with no reviewable changes (2)
- source/tests/consistent/descriptor/test_dpa4.py
- source/tests/consistent/fitting/test_dpa4_ener.py
🚧 Files skipped from review as they are similar to previous changes (11)
- deepmd/dpmodel/utils/spherical_harmonics.py
- deepmd/dpmodel/utils/lebedev.py
- source/tests/common/dpmodel/test_dpa4_ener.py
- source/tests/common/dpmodel/test_lebedev_sh.py
- deepmd/dpmodel/descriptor/dpa4_nn/projection.py
- deepmd/dpmodel/descriptor/dpa4_nn/radial.py
- deepmd/dpmodel/descriptor/dpa4_nn/embedding.py
- deepmd/dpmodel/descriptor/dpa4_nn/grid_net.py
- deepmd/dpmodel/descriptor/dpa4_nn/so3.py
- deepmd/dpmodel/descriptor/dpa4_nn/norm.py
- deepmd/dpmodel/fitting/dpa4_ener.py
pre-commit.ci run
…odeling#5522)

PR-2 of the DPA4/SeZM porting series (PR-1: deepmodeling#5515, dpmodel core). Makes DPA4 trainable and torch.export-compatible on the pt_expt backend.

## What's included

- **Descriptor wrapper** (`deepmd/pt_expt/descriptor/dpa4.py`): thin `@torch_module` `DescrptDPA4` registered as `SeZM`/`sezm`/`DPA4`/`dpa4`, plus explicit converter registrations for the two dpmodel classes that cannot be auto-wrapped via serialize round-trip: `WignerDCalculator` (deserialize raises by design; rebuilt from `lmax`/`eps`/`precision`) and `SwiGLU` (parameter-free, no serialize).
- **Trainable-weight promotion**: dpmodel stores DPA4 weights as bare numpy arrays (mirroring pt SeZM parameter names), which the generic ndarray rule registers as non-trainable buffers. A localized `_TRAINABLE_ATTRS` table re-registers exactly the attributes that are `nn.Parameter` in the pt SeZM implementation, after `__init__`/`deserialize`. Two supporting infra fixes: `_try_convert_list` now handles Optional-module lists (`SO2Convolution.non_linearities` ends with `None`), and `xp_asarray_nodetach` (`deepmd/dpmodel/array_api.py`) replaces `xp.asarray` on weight attributes in dpa4 dpmodel code so torch autograd is not silently detached.
- **Fitting wrapper** (`deepmd/pt_expt/fitting/dpa4_ener.py`): `SeZMEnergyFittingNet` (`dpa4_ener`/`sezm_ener`), `GLUFittingNet`, and a `SeZMNetworkCollection` wrapper mirroring the `NetworkCollection` ModuleDict pattern so GLU weights are optimizer-visible.
- **Model assembly** (`deepmd/pt_expt/model/get_model.py`): `get_sezm_model` dispatched for model type `dpa4`/`sezm`/`SeZM`/`DPA4`, mirroring pt's config handling (descriptor/fitting type defaults, `pair_exclude_types` reconciliation). Fail-fast `NotImplementedError` guards for pt-only extensions: spin, `bridging_method != none`, `lora`, `use_compile`, `preset_out_bias`. `enable_tf32` is accepted and ignored (pt_expt always runs "highest" matmul precision; documented).

## Tests

- pt_expt trio (`source/tests/pt_expt/descriptor/test_dpa4.py`): consistency (serde round-trip + dpmodel reference), `torch.export.export`, `make_fx` through forward + `autograd.grad`; fitting analogs in `source/tests/pt_expt/fitting/test_dpa4_ener.py`.
- **Parameter-gradient parity vs pt** (`source/tests/pt/model/test_dpa4_ptexpt_grad_parity.py`): weight-copied pt SeZM vs pt_expt, quadratic loss, every parameter's gradient compared 1:1 via serialize-tree alignment (83/83 parameters, no dead parameters, max deviation ~3e-11 rel from fp64 summation order).
- Model assembly tests (`source/tests/pt_expt/model/test_get_model_dpa4.py`): aliases, defaults, exclude-types branches, all NIE guards.
- Training end-to-end (`source/tests/pt_expt/test_training.py::test_training_loop_dpa4`) on the water fixture through the full trainer.
- Cross-backend consistency rows enabled for pt_expt (`source/tests/consistent/`): 20 new comparisons pass at existing harness tolerances.
- PR-1 parity suites stay green (381 passed) including for the small dpmodel fixes this PR needed (serialize via `to_numpy_array`, explicit-dtype zeros, ParameterList-safe `_load_variables`, `charge_spin` kwarg for atomic-model interface compatibility).

## Known limitations

- No freeze/`.pt2` export, no Python `DeepEval`, no C++/LAMMPS inference; that is PR-3 of the series.
- No multitask: `share_params` raises `NotImplementedError`.
- No model-dict deserialization of pt `SeZMModel` checkpoints (PR-3).
- pt-only SeZM extensions fail fast rather than being ported: spin, bridging/ZBL, LoRA, `use_compile`, `preset_out_bias`; `enable_tf32` ignored (numerically conservative).
- `_try_convert_list`'s ParameterList branch does not respect `trainable=False` for list-valued weights (pre-existing backend-wide behavior).
- CUDA validated at tolerance level only in CI; performance vs the pt Triton implementation is unbenchmarked.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Added PyTorch-experimental DPA4/SeZM descriptor and energy-fitting components, with public package exports.
  * Added EnergyModel factory/dispatch to build DPA4/SeZM models from configs.
* **Improvements**
  * Enhanced DPA4/SeZM runtime handling to keep gradient connectivity and ensure device-consistent tensor placement.
  * Updated DPA4 descriptor interface to accept optional `charge_spin` for compatibility.
* **Tests**
  * Expanded experimental coverage with serialization/parity checks, FX tracing/export validation, trainable-parameter behavior tests, and descriptor/fitting gradient parity verification.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…eepmodeling#5540)

PR-3 (final) of the DPA4/SeZM porting series, pt_expt **inference**: freeze to `.pt2`, Python `DeepEval`, pt→pt_expt checkpoint interop, C++ single-rank, and LAMMPS single-rank. Follows PR-1 (deepmodeling#5515, dpmodel core) and PR-2 (deepmodeling#5522, pt_expt training/export).

## What's included

- **Model freeze to `.pt2`** (`deepmd/pt_expt/model/ener_model.py`, `deepmd/dpmodel/.../ener_model.py`, `deepmd/dpmodel/descriptor/dpa4.py`): register `EnergyModel` under `sezm_ener`/`dpa4_ener` model-type aliases so `BaseModel.deserialize` resolves a standard DPA4 energy model (whose fitting type is `sezm_ener`). Fixed a `torch.export` specialization where `int()` on symbolic shapes baked `nf*nloc` (embedding/so2/attention).
- **NoPBC export fix** (`deepmd/dpmodel/descriptor/dpa4.py`): the `atype_ext[:, :nloc]` slice emitted a spurious `Ne(nall, nloc)` shape guard that crashed the compiled artifact when `nall==nloc` (no ghosts); replaced with `xp_take_first_n` (index_select). NoPBC now matches PBC.
- **pt→pt_expt checkpoint interop** (`deepmd/pt_expt/model/model.py`): `BaseModel.deserialize` unwraps pt's bespoke `SeZMModel` serialization (`type:"SeZM"`, nested `sezm_atomic` atomic model with the pt-only dens head), validates versions, rejects unsupported features (bridging/lora/dens/active_mode) with `NotImplementedError`, and delegates to the standard path.
- **Warn on silently-ignored flags** (`use_amp` descriptor, `enable_tf32` model): warn-once instead of silent drop.

## Tests

- **Model freeze** `source/tests/pt_expt/model/test_dpa4_export.py`: dual-artifact `.pt2`, metadata, AOTI load, artifact-vs-eager parity (1e-10). *(CI-skipped: AOTI is slow; run locally.)*
- **DeepEval parity vs pt** `source/tests/pt_expt/infer/test_dpa4_deep_eval.py`: pt `.pt` vs pt_expt `.pt2` energy/force/global-virial/atom-energy at fp64 1e-10, **PBC and NoPBC**; doubles as the checkpoint-interop proof. Per-atom virial compared by sum (pt's edge-scatter from deepmodeling#5518 redistributes it; global virial matches). *(CI-skipped: AOTI.)*
- **Interop unit tests** `source/tests/pt_expt/model/test_dpa4_interop.py` (CI-runnable, no AOTI): happy-path pt-checkpoint→pt_expt round-trip + every guard branch + version validation + `@variables` filtering.
- **Alias deserialize guard** + **use_amp/enable_tf32 warn-once** tests (CI-runnable).
- **Fixture generator** `source/tests/infer/gen_dpa4.py` (+ wired into `source/install/test_cc_local.sh`).
- **C++ single-rank** `source/api_cc/tests/test_deeppot_dpa4_ptexpt.cc`: 20 tests (double+float), dpa3-matched tolerances. Validated locally.
- **LAMMPS single-rank** `source/lmp/tests/test_lammps_dpa4_pt2.py`: parity + `atom_modify map yes` + the deepmodeling#5450 no-atom-map fail-fast. **Validated on a GPU box (7 passed).**

PR-1 parity suites stay green; the small dpmodel edits are parity-revalidated.

## Known limitations

- **Single-rank only.** Multi-rank/MPI LAMMPS for DPA4 is deferred (no live multi-rank cell; the with-comm artifact compiles but its runtime is not exercised). DPA4 is a message-passing descriptor, so multi-rank follows the existing deepmodeling#5450/deepmodeling#5430 machinery in a later PR.
- **No `.pth` (torch.jit) DPA4**: the pt backend has no `sezm_ener` *model* registration, so `.pth` freeze of a standard DPA4 energy model isn't available; not needed for the pt_expt inference path.
- **Per-atom virial** is not compared element-wise pt-vs-pt_expt (only its global sum); deepmodeling#5518 changed pt's edge-scatter distribution; both are correct, the distribution differs.
- **AOTI tests are CI-skipped** (multi-minute compile); the freeze/DeepEval paths are validated locally, not in CI; the interop/alias/warn tests give CI coverage of the non-AOTI logic.
- **fp64 only**; fp32 freeze untested. CUDA validated at LAMMPS level on a GPU box; the AOTI parity numbers are from CPU fp64.
- **`use_amp`/`enable_tf32`** remain functionally ignored (now warned), by design for this series.
- pt SeZM features out of scope (guarded `NotImplementedError`): spin, ZBL bridging, LoRA, dens/direct-force/denoising heads, SO3 grid projection, GridMLP, SO(2) attention extensions.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

# Release Notes

* **New Features**
  * Enabled DPA4 model inference via the pt_expt backend using dual-artifact compilation.
  * Registered the EnergyModel under additional aliases: `sezm_ener` and `dpa4_ener`.
* **Improvements**
  * Improved dynamic/symbolic shape handling across DPA4 components for export/tracing stability.
  * Enhanced pt SeZM/DPA4 checkpoint deserialization and normalization for interoperability.
  * Added one-time warnings when `use_amp` or `enable_tf32` settings are ineffective.
* **Tests**
  * Added C++ and Python coverage for pt2 inference, LAMMPS integration, model export/freeze, parity, interop, and warning behavior.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
What this PR does
Ports the core DPA4/SeZM descriptor and its GLU energy fitting from the PyTorch-only implementation (`deepmd/pt`, #5448/#5503) to the backend-agnostic dpmodel layer (numpy / array-API). This is PR 1 of a 3-PR series (next: pt_expt wrappers + training; then `.pt2` freeze + Python/C++ inference).
Contents
- `deepmd/dpmodel/utils/lebedev.py` + relocated `lebedev_rules.npz` (single source of truth; pt re-wraps it), `deepmd/dpmodel/utils/spherical_harmonics.py`: general-lmax real spherical harmonics in pure numpy, matching e3nn's `normalize=True, normalization="norm"` convention exactly (validated vs e3nn to ~1e-15), so dpmodel builds the S2-grid constants without e3nn/torch.
- `deepmd/dpmodel/descriptor/dpa4_nn/`: array-API ports of the `sezm_nn` math modules (indexing, utils, radial, norm, wignerd, so3, activation, projection, grid_net, so2, attention, embedding, edge_cache, ffn, block), structurally mirroring the pt code 1:1.
- `deepmd/dpmodel/descriptor/dpa4.py`: `DescrptDPA4`, registered `dpa4`/`DPA4`/`sezm`/`SeZM`, serialization-compatible with pt `DescrptSeZM` (cross-deserialization tested in both directions).
- `deepmd/dpmodel/fitting/dpa4_ener.py`: `SeZMEnergyFittingNet` (GLU fitting), registered `dpa4_ener`/`sezm_ener`.
Design: padded edges instead of sparse `nonzero` edges
pt builds a sparse edge list via `torch.nonzero` and aggregates with `index_add_`/scatter. The dpmodel port uses a shape-static padded edge layout (`E = nf*nloc*nnei`, node-contiguous, validity mask): per-edge math is identical, node aggregations and the segment softmax become masked reductions over the `nnei` axis. This removes data-dependent shapes (torch.export-friendly for the upcoming pt_expt PR) and avoids non-standard scatter ops in array-API code. Masked slots carry a safe dummy direction before quaternion/Wigner evaluation so future autograd backward passes cannot be NaN-poisoned by dead slots.
Validation
- `source/tests/pt/model/test_dpa4_dpmodel_parity.py` (377 tests), including dual-cache tests proving pt-sparse vs dp-padded equivalence with adversarial garbage in masked slots.
- `source/tests/common/dpmodel/`: serialize round-trips, permutation equivariance, rotation invariance, masked-edge inertness, guard coverage.
- `source/tests/consistent/descriptor/test_dpa4.py`, `source/tests/consistent/fitting/test_dpa4_ener.py` (dpa3-precedent tolerances; f64 atol stricter than precedent).
- `dpa4` model `serialize()` top-level shape for the follow-up PRs.
Known limitations
- `NotImplementedError` naming the config key at construction/deserialize: spin/charge embedding (`add_chg_spin_ebd`), DeNS dual fitting, ZBL/inner-clamp bridging, multitask FiLM/case embedding (`dim_case_embd>0`), LoRA, SO3-grid paths (`ffn_so3_grid`, `node_wise_*`, `message_node_*`), `grid_mlp`, attention-residual modes, `layer_scale`, attention projections.
- `lebedev_quadrature=False` (e3nn product grid) is rejected; the default config uses Lebedev.
- `random_gamma` gauge augmentation is accepted and serialized but applied only when explicitly driven (numpy RNG; not stream-identical to torch's draw; gauge invariance is property-tested).
Summary by CodeRabbit
New Features
Bug Fixes
Tests
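
The padded-edge design in the PR description (masked reductions over the `nnei` axis in place of scatter ops) can be illustrated with a small NumPy sketch. The function names, shapes, and the softmax-weighted sum are assumptions chosen for illustration, not the code in `dpa4_nn`; the point is only that masked slots, even when filled with garbage, contribute nothing.

```python
import numpy as np


def masked_softmax(logits, mask):
    """Softmax over the last (nnei) axis that ignores masked slots."""
    clean = np.where(mask, logits, 0.0)  # drop garbage such as NaN
    row_max = np.max(np.where(mask, clean, -np.inf), axis=-1, keepdims=True)
    # Rows with no valid neighbor have max = -inf; use 0 as a safe shift.
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    shifted = np.where(mask, clean - row_max, -np.inf)
    weights = np.exp(shifted)  # exp(-inf) == 0 for masked slots
    denom = np.sum(weights, axis=-1, keepdims=True)
    return weights / np.where(denom > 0, denom, 1.0)


def aggregate(messages, weights, mask):
    """Weighted sum of per-edge messages over the nnei axis.

    messages: (nf, nloc, nnei, dim); weights, mask: (nf, nloc, nnei).
    """
    safe = np.where(mask[..., None], messages, 0.0)
    return np.sum(weights[..., None] * safe, axis=2)


# One frame, two nodes, three neighbor slots; NaN is adversarial garbage.
logits = np.array([[[1.0, 2.0, np.nan], [5.0, 5.0, 5.0]]])
mask = np.array([[[True, True, False], [False, False, False]]])
messages = np.ones((1, 2, 3, 4))
messages[0, 0, 2] = np.nan

weights = masked_softmax(logits, mask)
node_feat = aggregate(messages, weights, mask)
```

All shapes are fixed by `nf`, `nloc`, and `nnei`, so nothing depends on how many edges are valid; a node with no valid neighbors simply receives zeros.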