feat(pt_expt): O(N) on-device NeighborGraph builders — vesin & nvalchemiops (PR-C)#5714
feat(pt_expt): O(N) on-device NeighborGraph builders — vesin & nvalchemiops (PR-C)#5714wanghan-iapcm wants to merge 13 commits into
Conversation
Whole-branch review M1: mirror ase_builder's empty-part-list guard so the vesin builder handles nf==0 without a torch.cat([]) RuntimeError. M3: extend the dpmodel fail-fast test to cover method='nv' (not just 'vesin').
|
Warning Review limit reached
Next review available in: 10 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
📝 WalkthroughWalkthroughThe PR adds PT experimental neighbor-graph builders for CUDA and vesin backends, updates ASE neighbor-graph filtering for virtual atoms, and adds regression tests for decoding, parity, dispatch, and plugin import state restoration. ChangesPT experimental neighbor-graph builders and validation
Plugin import state restoration
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #5714 +/- ##
==========================================
- Coverage 81.24% 80.51% -0.73%
==========================================
Files 981 990 +9
Lines 109860 111001 +1141
Branches 4234 4235 +1
==========================================
+ Hits 89257 89375 +118
- Misses 19080 20101 +1021
- Partials 1523 1525 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
| World-2 counterpart of vesin_neighbor_list.py: instead of building the dense | ||
| quartet, it returns per-frame local (i, j, S), then delegates to the array-API | ||
| ``neighbor_graph_from_ijs`` (which recomputes ``edge_vec`` differentiably from | ||
| the ORIGINAL grad-carrying coords). torch-only => lives in pt_expt. |
There was a problem hiding this comment.
vesin multi-frame loops over frames in Python while nv batches natively. Could we confirm that batched GPU training routes through nv (not vesin) so the per-frame loop is not on the training hot path, and maybe document that vesin is intended for nf=1 inference / CPU?
There was a problem hiding this comment.
Confirmed: batched training never routes through vesin by default — neighbor_graph_method=None resolves to the "dense" converter (make_model.py _resolve_graph_method), and vesin/nv are explicit opt-in only. Documented the scope in the module docstring in 6905918: vesin = nf==1 inference / CPU (per-frame loop ~1 ms/frame call overhead measured on a T4); batched multi-frame GPU work should opt into nv, which batches all frames in one kernel.
| # and search all frames in ONE kernel via batch_idx/batch_ptr. | ||
| if periodic: | ||
| cell = box.reshape(nf, 3, 3).to(device=device, dtype=coord.dtype) | ||
| coord = normalize_coord(coord, cell) |
There was a problem hiding this comment.
Minor consistency note: build_neighbor_graph_nv normalizes coords before neighbor_graph_from_ijs while build_neighbor_graph_vesin passes the original unnormalized coords. Both are internally self-consistent (S matches the coords each search used), but a one-line comment on each explaining why they differ would help future readers.
There was a problem hiding this comment.
Added cross-referencing comments in both builders in 6905918: nvalchemiops requires in-cell positions, so nv searches AND recomputes edge_vec on the normalized coords (S matches what the search saw; normalize_coord is a differentiable lattice translation with identity gradient); vesin's cell list handles unwrapped positions natively, so it keeps the original grad-carrying coords.
| dst = edge_idx // max_neighbors # flattened center | ||
| src = neighbor_matrix.reshape(-1).index_select(0, edge_idx).to(torch.int64) | ||
| shift = shifts.reshape(-1, 3).index_select(0, edge_idx).to(torch.int64) | ||
| frame_idx = (dst // nloc).to(torch.int64) # frame of the edge | ||
| center_local = (dst % nloc).to(torch.int64) # i = center | ||
| src_local = (src % nloc).to(torch.int64) # j = neighbor | ||
|
|
||
| return neighbor_graph_from_ijs( | ||
| center_local, src_local, shift, coord, box_out, frame_idx, nloc, layout=layout | ||
| ) |
There was a problem hiding this comment.
This decode (matrix -> (i, j, S)) is pure torch and CPU-runnable, but it currently lives inside the CUDA-only build_neighbor_graph_nv, so it only runs under the opt-in CUDA suite. Could we extract it into a standalone helper (taking neighbor_matrix, num_neighbors, shifts, nloc, coord, box) and unit-test it on the default CPU CI with synthetic inputs? The GPU neighbor_list search stays opt-in, but the regression-prone arithmetic (// max_neighbors, % nloc, frame isolation, neighbor_graph_from_ijs) would then be behind the normal merge gate. Not blocking since nv is explicit opt-in.
There was a problem hiding this comment.
Done in 6905918: extracted nv_matrix_to_ijs(neighbor_matrix, num_neighbors, shifts, nloc) -> (i, j, S, frame_idx) — pure torch, device-agnostic — and the builder now calls it. New default-CI CPU test test_nv_matrix_decode.py covers the regression-prone arithmetic with a hand-checked two-frame fixture (incl. stale-slot garbage that must be ignored), an all-empty matrix, and a random-vs-brute-force-oracle sweep. The GPU neighbor_list search stays behind the opt-in CUDA suite.
| pytestmark = pytest.mark.skipif( | ||
| not (torch.cuda.is_available() and is_nv_available()), | ||
| reason="nvalchemiops requires CUDA + nvalchemi-toolkit-ops", | ||
| ) |
There was a problem hiding this comment.
These CUDA integration tests correctly stay gated. The gap is separate: the CPU-runnable decode in nv_graph_builder.py has no default-CI test, so a shared-code refactor could regress it silently without the CUDA tag. Suggest a CPU unit test of the extracted decode (see my comment on the builder).
There was a problem hiding this comment.
Covered by the extraction in 6905918 — nv_matrix_to_ijs now has a default-CI CPU unit test (source/tests/pt_expt/utils/test_nv_matrix_decode.py: hand-checked, empty, random-vs-oracle), so a shared-code refactor of the decode can no longer regress silently without the CUDA tag.
… (review) Address iProzd's deepmodeling#5714 review: - extract the nv dense-matrix -> (i, j, S) decode into nv_matrix_to_ijs (pure torch, device-agnostic) and unit-test it on the default CPU CI with hand-checked, empty, and random-vs-oracle synthetic inputs; the CUDA neighbor_list search stays behind the opt-in GPU suite - document why nv searches/recomputes on NORMALIZED coords while vesin uses the ORIGINAL coords (nvalchemiops needs in-cell positions; vesin handles unwrapped positions natively) - each self-consistent with its S - document the vesin per-frame Python loop scope: intended for nf==1 inference / CPU; never a default (neighbor_graph_method=None resolves to 'dense'); batched multi-frame GPU work should use nv
|
|
||
| Dispatches on ``self._neighbor_graph_method``: ``dense``/``ase`` run | ||
| backend-agnostic (numpy); ``vesin``/``nv`` run on-device (torch, O(N)). | ||
| All backends emit the SAME neighbor set (carry-all, sel-free), so the |
There was a problem hiding this comment.
The "SAME neighbor set" statement is accurate for normal real atoms, but it is worth documenting one boundary here: the dense builder filters atype < 0 padding/placeholder atoms during graph construction, while the O(N) builders (ase/vesin/nv, matching the SeZM edge-schema style) build geometric edges first and rely on downstream masks/consumers to suppress non-real atoms. For inputs with mixed-type padding this is not literally the same edge set, so downstream graph consumers that need dense parity should explicitly mask/filter edges touching atype < 0, or this docstring should state that responsibility.
There was a problem hiding this comment.
Valid — and rather than documenting the divergence, fixed the builders to honor the contract in 6396068: ase/vesin/nv now post-filter the (i, j, S) edge list so virtual atoms (atype < 0) are excluded as both centers and neighbors, matching the dense reference builder (builder.py's documented contract). The "SAME neighbor set" docstring is therefore accurate again, including for mixed-type padding inputs, and downstream graph consumers need no extra masking. Set-equality tests vs the dense builder with a virtual atom added for all three: ase on the default CI, vesin (guarded), nv (CUDA-gated, validated on a T4: 11/11 with the vesin suite). Note the ase-builder half of this bug also exists on master (PR-A vintage) and is fixed here in passing.
iProzd
left a comment
There was a problem hiding this comment.
Agree with @OutisLi, and I think this is a correctness gap for the opt-in O(N) builders, not just a docstring scope note. Confirmed in code:
build_neighbor_graph(dense) drops virtual-atom edges viakeep_mask = ... & ~(extended_atype<0) & ~(atype<0).vesin/nv/asebuild geometric edges from coords only and never look atatype.forward_common_atomic_graphmasks edges only forpair_excl; foratype<0it only clamps the type and zeroes the virtual atom's own output — it does not drop edges from a virtual neighbor into a real center.
So for inputs with atype<0 (e.g. mixed-natoms padding), a real center that has a virtual atom within rcut gets an extra type-0 edge under vesin/nv/ase but not under dense, polluting the real atom's descriptor/energy/force. The "SAME neighbor set / parity" claim then holds only for all-real-atom inputs, which is likely all the current set-equality fixtures cover.
Rather than only documenting the caller's responsibility, could the O(N) builders filter edges touching atype<0 (drop where center or neighbor-owner is virtual) to match dense, so the parity claim stays literally true? A set-equality test with a virtual-atom fixture would lock it in. Default users are unaffected (default resolves to dense), so this is opt-in-only severity, but it undermines the stated dense parity.
OutisLi
left a comment
There was a problem hiding this comment.
Please rework the nv graph path to reuse the existing NvNeighborList implementation rather than adding a second nvalchemiops search stack. The new graph builder duplicates the kernel setup, capacity loop, device handling, and matrix decode that NvNeighborList already owns, which increases maintenance cost and has already diverged in behavior. This can be fixed by extending NvNeighborList with a graph-oriented return mode or a thin build_graph method while preserving the existing extended/edges modes so current SeZM users are not affected.
| return center_local, src_local, shift, frame_idx | ||
|
|
||
|
|
||
| def build_neighbor_graph_nv( |
There was a problem hiding this comment.
This should reuse NvNeighborList instead of introducing a parallel nvalchemiops implementation. NvNeighborList already owns the matrix search, capacity growth, device context, and extended/edges conversions used by the SeZM path. Adding build_neighbor_graph_nv separately means future fixes must be kept in sync manually, and this copy has already drifted by omitting the existing _input_device_context(device) guard. Please extend NvNeighborList with a graph return mode or a thin build_graph method that produces NeighborGraph, while preserving the current extended and edges return modes so existing SeZM behavior stays unchanged.
There was a problem hiding this comment.
Agreed on single-sourcing — implemented in 2f7ea9e with the dependency direction inverted relative to the suggestion (per project roadmap): the graph-builder module now OWNS the search (nv_search_matrix: device-context pinning — restoring the _input_device_context guard this copy had dropped, good catch — periodic normalization, batch setup, grow-until-fit loop) and NvNeighborList.build delegates to it via a lazy import. Rationale: the World-1 strategy layer is scheduled for deprecation with the dense-nlist path, so the long-lived graph module is the owner and the legacy class is the caller — when World-1 is removed, nothing needs to move again. extended/edges return modes are byte-unchanged; GPU-validated on a T4 (nv builder 18/18, NvNeighborList/nlist_backend suites 75 passed + 26 subtests).
|
Others LGTM. |
OutisLi
left a comment
There was a problem hiding this comment.
The same design concern applies to both nv and vesin: graph-form NeighborGraph construction should extend the existing NeighborList strategies instead of adding separate parallel builders. VesinNeighborList and NvNeighborList already provide the O(N) search backends used by SeZM through extended/edges return modes; the graph path should be another output contract on the same strategy layer, not a second copy of each backend. Please unify this design while preserving the existing extended and edges modes for current SeZM users.
| ) | ||
|
|
||
|
|
||
| def build_neighbor_graph_vesin( |
There was a problem hiding this comment.
Same structural issue as the nv path: this duplicates the existing VesinNeighborList search/conversion logic instead of extending the established NeighborList strategy. VesinNeighborList already owns the vesin.torch O(N) builder and supports extended/edges return modes for the SeZM path. Please make NeighborGraph another output contract of that shared strategy layer, or provide a thin build_graph method there, while preserving existing extended and edges behavior.
There was a problem hiding this comment.
Same resolution as the nv path, in 2f7ea9e: the raw vesin search is single-sourced as vesin_graph_builder.vesin_search_ijs (device pinning included) and BOTH VesinNeighborList._build_single and _build_single_edges now call it (lazy import to break the module cycle) — graph module owns, deprecation-bound strategy delegates. Extended/edges behavior byte-unchanged (ss float cast kept at the _build_single call site so ss @ box math is identical); full vesin + neighbor_list suites pass on CPU and T4.
OutisLi
left a comment
There was a problem hiding this comment.
Please also restore the graph-lower eligibility check in the pt_expt graph dispatch. Explicit neighbor_graph_method values should not bypass the same mixed_types/uses_graph_lower contract enforced by dpmodel and by graph export.
| ng = build_neighbor_graph(cc, atype, bb, rcut) | ||
| elif method == "ase": | ||
| ng = build_neighbor_graph_ase(cc, atype, bb, rcut) | ||
| elif method == "vesin": |
There was a problem hiding this comment.
This pt_expt override no longer performs the graph-eligibility check that dpmodel keeps before building a NeighborGraph. _resolve_graph_method() only protects the default neighbor_graph_method=None path; an explicit neighbor_graph_method="dense", "ase", "vesin", or "nv" reaches this dispatch even for descriptors that do not support the graph lower (for example non-mixed descriptors, DPA1 with attention layers, or other configs where uses_graph_lower() is false). Please restore the same mixed_types() and descriptor.uses_graph_lower() fail-fast check here before dispatching to any builder, and add a regression test with a non-graph-eligible descriptor explicitly passing one of these methods.
There was a problem hiding this comment.
Valid — fixed in a6eb9bc: _call_common_graph now carries the same mixed_types() and descriptor.uses_graph_lower() fail-fast as dpmodel (NotImplementedError), so an explicit neighbor_graph_method can no longer reach the builders for ineligible descriptors. Regression test drives se_e2_a through all four explicit methods and expects the raise.
| auto_batch_size: bool | int | AutoBatchSize = True, | ||
| neighbor_list: Optional["ase.neighborlist.NewPrimitiveNeighborList"] = None, | ||
| nlist_backend: str = "auto", | ||
| neighbor_graph_method: str = "dense", |
There was a problem hiding this comment.
Please document that neighbor_graph_method is graph-form only. For regular .pte / nlist-form .pt2 inference this value is currently ignored because _eval_model() only dispatches to _eval_model_graph() when metadata["lower_input_kind"] == "graph"; users who want the nlist path O(N) builder must use nlist_backend, not neighbor_graph_method. It would be clearer to either fail fast / warn when neighbor_graph_method != "dense" on a non-graph artifact, or spell out this no-op behavior in the constructor docstring.
There was a problem hiding this comment.
Done in d2cf920: both knobs are now documented in the DeepEval docstring (which controls which path, and that they don't overlap), and a non-default neighbor_graph_method on a non-graph artifact now RAISES at construction instead of being silently ignored, pointing the user at nlist_backend. Regression test on a nlist-form .pt2.
OutisLi
left a comment
There was a problem hiding this comment.
Please revisit the public neighbor-backend API design. The PR now exposes two knobs, nlist_backend and neighbor_graph_method, whose scopes are hard to infer from their names and whose supported values differ. From a user perspective both choose the neighbor construction backend, so the API should be made more natural and unambiguous. In particular, if nv is a supported O(N) backend, it should not work only for graph-form .pt2 while the nlist path has no nv option, especially since NvNeighborList already exists.
| auto_batch_size: bool | int | AutoBatchSize = True, | ||
| neighbor_list: Optional["ase.neighborlist.NewPrimitiveNeighborList"] = None, | ||
| nlist_backend: str = "auto", | ||
| neighbor_graph_method: str = "dense", |
There was a problem hiding this comment.
This adds a second public neighbor-builder knob next to nlist_backend, but the names and value sets do not make the distinction clear. nlist_backend controls the nlist/extended lower path and accepts auto/vesin/native; neighbor_graph_method controls only graph-form .pt2 and accepts dense/ase/vesin/nv. A user who wants an O(N) backend cannot tell which knob to use, and nv oddly works only on the graph path even though NvNeighborList already exists for the nlist strategy layer. Please consolidate this into a single backend-selection API, or otherwise rename/fail-fast/document the two paths clearly, and make nv support consistent across both nlist and graph lower schemas without breaking the existing SeZM extended/edges behavior.
There was a problem hiding this comment.
Agreed the two-knob split is not a good end state. What this PR now does: documents the split explicitly and fail-fasts on misuse (d2cf920), so a user can no longer pick the wrong knob silently. Full consolidation into a single backend-selection API (and nv on the nlist path) is deferred deliberately: the dense-nlist lower is scheduled for total deprecation in favor of the graph schema (see the NeighborGraph design discussion), at which point nlist_backend disappears and neighbor_graph_method becomes THE selector — consolidating now would mean building API surface on the path being removed. Happy to reorder if you think the deprecation is too far out.
| auto_batch_size: bool | int | AutoBatchSize = True, | ||
| neighbor_list: Optional["ase.neighborlist.NewPrimitiveNeighborList"] = None, | ||
| nlist_backend: str = "auto", | ||
| neighbor_graph_method: str = "dense", |
There was a problem hiding this comment.
One performance note for the eventual unified/auto backend selection: in the SeZM path we found that CPU + nf == 1 is better served by vesin than nvalchemiops, because vesin has the lighter single-system cell-list path while nvalchemiops/Warp has more setup overhead to amortize. The current SeZM selector therefore uses vesin for device.type == "cpu" and nf == 1, and prefers nv for CUDA or multi-frame batches when available. This may be worth carrying over when designing the common default/auto policy.
There was a problem hiding this comment.
Thanks — carried into the vesin builder's scope note and it matches our benchmark data (T4, diamond C, rcut 6): vesin and nv are equal within noise at nf==1 on CUDA, and vesin's per-frame Python loop (~1 ms/frame) makes nv the clear choice for multi-frame batches; your CPU+nf==1 vesin preference is consistent with vesin's lighter single-system path. Will encode exactly this policy (cpu+nf==1 -> vesin; CUDA/multi-frame -> nv when available) when the auto selector lands with the unified API.
test_plugin popped deepmd.pt_expt from sys.modules without restoring it,
leaving the package's cached submodules bound to a dead parent. Any later
import of a cached submodule (e.g. deepmd.pt_expt.infer.deep_eval)
re-created a BARE parent whose utils/infer attributes were never rebound,
and mock.patch('deepmd.pt_expt.utils...') in
test_deep_eval_serialize_api failed with AttributeError under py3.10's
mock target resolution. Shard-order dependent: this PR's new test files
reshuffled CI shard 4 and exposed the latent master-side hygiene bug.
Snapshot the whole deepmd.pt_expt module tree before the re-import and
restore it (including the parent-package attribute) afterwards.
The dense reference builder filters virtual atoms (atype < 0) as both centers and neighbors during construction, but the geometric ase/vesin/nv searches are type-blind, so with virtual/padding atoms present the edge sets diverged from the dense builder and the 'SAME neighbor set' contract was violated (OutisLi review, deepmodeling#5714). Post-filter the (i, j, S) edge list in all three builders by the atype of both endpoints. Set-equality tests vs the dense builder with a virtual atom added for ase (default CI), vesin (guarded), and nv (CUDA-gated).
…egacy strategies delegate The shared search helpers now live in the World-2 graph-builder modules (vesin_graph_builder.vesin_search_ijs, nv_graph_builder.nv_search_matrix), which are the long-term primary path. The legacy VesinNeighborList and NvNeighborList classes (scheduled for deprecation with the dense-nlist path) import and call these helpers rather than owning their own copies of the same logic. Direction is graph-builder-centric (World-1 strategies are callers, not owners), matching the reviewer's intent and the project's deprecation roadmap. Key details: - vesin_search_ijs: returns (ii, jj, ss) as int64; handles the zeros-box for non-periodic internally; pins torch.device for vesin's internal allocs. _build_single casts ss to float at the call site (unchanged math for ss@box). _build_single_edges uses int64 ss directly (edge_schema_from_ij_shifts accepts int shifts). Function-level import in vesin_neighbor_list avoids a module-level cycle (vesin_graph_builder imports is_vesin_torch_available from vesin_neighbor_list). - nv_search_matrix: wraps the full search pipeline including _input_device_context pinning (restoring the guard the nv graph builder previously omitted), periodic normalization, batch tensor construction, and the grow-until-fit capacity loop. NvNeighborList.build uses a function-level import (lazy, avoids pt->pt_expt module-level cycle). Post-search code in build() is wrapped in its own _input_device_context block so device behavior is byte-unchanged relative to the original single-context design. - _matrix_to_extended_inputs, edge_schema_from_neighbor_matrix, and _truncate_to_sel_compiled are entirely unchanged (extended/edges modes byte-identical). - _grow_search_capacity and _input_device_context remain in nv_nlist.py for callers that import them directly (test_nv_nlist imports _input_device_context).
…ble descriptors Mirror the dpmodel guard in the pt_expt _call_common_graph override: _resolve_graph_method's eligibility check only protects the default (None) path, so an EXPLICIT method reached the builders for descriptors without a graph lower. Regression test drives se_e2_a (mixed_types False) through all four explicit methods and expects NotImplementedError. Addresses OutisLi review on deepmodeling#5714.
…backend split neighbor_graph_method is consumed only by graph-form .pt2 eval; a non-default value on a nlist-form artifact was silently ignored, misleading users who wanted an O(N) builder (that knob is nlist_backend). Document both knobs in the DeepEval docstring (incl. the planned consolidation at the dense-nlist deprecation) and raise at construction when neighbor_graph_method != 'dense' on a non-graph artifact; regression test on a nlist-form .pt2. Addresses OutisLi review on deepmodeling#5714.
| nf, nloc = atype.shape[:2] | ||
| target_neighbors = int(sum(sel)) | ||
| coord = coord.reshape(nf, nloc, 3) | ||
| periodic = box is not None |
| from deepmd.pt.utils.nv_nlist import ( | ||
| _input_device_context, | ||
| choose_nv_nlist_method, | ||
| is_nv_available, | ||
| ) |
| nf = coord.shape[0] if coord.ndim == 3 else 1 | ||
| coord = coord.reshape(nf, -1, 3) | ||
| nloc = coord.shape[1] | ||
| periodic = box is not None |
NeighborGraph PR-C — O(N) on-device graph builders (vesin / nv)
Third PR of the NeighborGraph series (after #5581 foundation, #5583 PR-A dpa1 graph forward, #5604 PR-B
.pt2/C++). Adds two O(N) carry-all NeighborGraph builders behind the World-2neighbor_graph_methoddispatcher, replacing PR-A's O(N²)densesearch / per-frame ASE stopgap with on-device cell lists.What
build_neighbor_graph_vesin(deepmd/pt_expt/utils/vesin_graph_builder.py) —vesin.torchcell list. Device-following: runs the search on the input tensor's device (CUDA kernel on CUDA input, CPU cell list otherwise).build_neighbor_graph_nv(deepmd/pt_expt/utils/nv_graph_builder.py) — nvalchemiops GPU cell list, frame-batched (batch_idx/batch_ptr, one kernel for all frames — no Python loop). CUDA-only.build_neighbor_graph_ase: search → per-frame local(i, j, S)→neighbor_graph_from_ijs(...), which recomputesedge_vecdifferentiably from the original grad-carrying coords.neighbor_graph_method ∈ {"legacy","dense","ase","vesin","nv"}) and DeepEval.pt2graph inference (newneighbor_graph_methodkwarg, default"dense"→ existing inference byte-identical). dpmodel/jax fail-fast onvesin/nv(torch/CUDA-only).Perf-only: all builders emit the SAME neighbor set as
dense(carry-all,sel=normalization-only), proven by exact set-equality; energy/force/virial are unchanged (parity 1e-12 CPU / 1e-10 CUDA).Layering
dpmodelstays torch-free: vesin/nv builders live inpt_expt; the dpmodel dispatch only carries a fail-fast message.is_vesin_torch_available()/is_nv_available(),ImportErrorwith an install hint on absence.deepmd/pt/utils/nv_nlist.py:_matrix_to_extended_inputsStep-1 extraction.Testing
test_nv_graph_builder.py: 4 passed (set-equality vs dense periodic+non-periodic, frame-batch, differentiableedge_vec).test_graph_builder_dispatch.py: 3 passed on CUDA (vesin parity 1e-10, dpmodel reject vesin+nv, nv parity 1e-10).test_graph_deepeval.py: 6 passed on CUDA (.pt2graph dense parity + vesin, AOTI compile +torch.as_tensorextraction).Known limitations
vesin.torch.computeis single-system (no batch dim), so multi-frame vesin loops over frames (each an O(N) search;nf=1inference has zero loop cost). nv batches natively — the loop-free path for batched training.pyproject; no"auto"selector (explicit strings only).search_capacity = max(64, nloc)initial heuristic + 1.25× grow loop (.item()host-sync per grow).from_ijs(differentiable lattice translation, identity gradient;edge_vecis lattice-invariant) — consistent with the pt nv-nlist path.Implements
plan_neighbor_graph_prC_implementation; design spec: discussion wanghan-iapcm#4.Summary by CodeRabbit
type < 0) from graph edges consistently for ASE-based neighbor graphs.