refact (pt_expt): provide infrastructure for converting dpmodel classes to PyTorch modules. #5204

Merged
njzjz merged 34 commits into deepmodeling:master from wanghan-iapcm:refact-auto-setattr on Feb 9, 2026

@wanghan-iapcm wanghan-iapcm commented Feb 8, 2026

Collaborator

To be considered after #5194 is merged.

This PR provides infrastructure for automatically wrapping dpmodel classes (array_api_compat-based) as PyTorch modules. The key insight is to detect attributes by their value type rather than by hard-coded names.
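
To make the "detect by value type" idea concrete, here is a minimal sketch of a type-keyed conversion registry hooked into `__setattr__`. Everything in it (`register_mapping`, `try_convert`, `PlainMask`, `WrappedMask`, `Holder`) is a hypothetical stand-in written for illustration, and the exact-type lookup is an assumption; it is not the code in `deepmd/pt_expt/common.py`.

```python
from typing import Any, Callable

# Hypothetical registry: maps a source class to a converter callable.
_REGISTRY: dict[type, Callable[[Any], Any]] = {}


def register_mapping(src_type: type, converter: Callable[[Any], Any]) -> None:
    """Remember how to convert instances of src_type."""
    _REGISTRY[src_type] = converter


def try_convert(value: Any) -> Any:
    """Convert value if its exact type is registered, else return it unchanged."""
    converter = _REGISTRY.get(type(value))
    return converter(value) if converter is not None else value


class PlainMask:
    """Stand-in for a backend-agnostic helper class."""

    def __init__(self, types):
        self.types = list(types)


class WrappedMask(PlainMask):
    """Stand-in for the backend-specific wrapper of PlainMask."""


class Holder:
    """Converts attributes by the type of the value, never by attribute name."""

    def __setattr__(self, name: str, value: Any) -> None:
        object.__setattr__(self, name, try_convert(value))


register_mapping(PlainMask, lambda v: WrappedMask(v.types))

h = Holder()
h.any_name_at_all = PlainMask([0, 1])  # converted because of its type
h.cutoff = 6.0                         # unregistered type, stored as-is
```

Because the lookup is keyed on the value's class, a newly added attribute needs no change in the wrapper as long as its type is already registered.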

Summary by CodeRabbit

  • New Features
    • Registry-driven conversion for DP objects to PyTorch modules enabling automatic wrapper creation.
    • New PyTorch-friendly descriptor variants with stable forward outputs for se_e2_a and se_r.
    • PyTorch-wrapped exclude-mask utilities and a NetworkCollection of wrapped network types for proper module/state handling.
    • Device-aware tensor conversion and robust handling of numpy buffers and None-valued buffers for reliable serialization/movement.

@codecov

codecov Bot commented Feb 8, 2026

Codecov Report

❌ Patch coverage is 94.73684% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.00%. Comparing base (5c2ca51) to head (55e094e).
⚠️ Report is 159 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| deepmd/pt_expt/common.py | 91.42% | 3 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5204   +/-   ##
=======================================
  Coverage   81.99%   82.00%           
=======================================
  Files         724      724           
  Lines       73807    73801    -6     
  Branches     3616     3615    -1     
=======================================
+ Hits        60519    60520    +1     
+ Misses      12124    12118    -6     
+ Partials     1164     1163    -1     
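
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below assumes coverage is computed as hits / (hits + misses + partials) and that percentages are truncated rather than rounded to two decimals; that reading is inferred from the numbers above, not taken from Codecov documentation. The helper name `truncate_percent` is made up for this example.

```python
import math


def truncate_percent(hit: int, total: int, digits: int = 2) -> float:
    """Percentage of hit/total, truncated (not rounded) to `digits` decimals."""
    scale = 10 ** digits
    return math.floor(hit / total * 100 * scale) / scale


# Project coverage on base and head, from the Hits/Misses/Partials rows.
base_cov = truncate_percent(60519, 60519 + 12124 + 1164)
head_cov = truncate_percent(60520, 60520 + 12118 + 1163)

# Patch size implied by "94.73684% with 3 lines missing".
missing = 3
patch_pct = 94.73684
patch_total = round(missing / (1 - patch_pct / 100))
```

Under these assumptions the base and head values match the 81.99% and 82.00% shown in the table, and the patch works out to 57 measured lines, 54 of them covered.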

@wanghan-iapcm wanghan-iapcm changed the title from "refact: provide infrastructure for converting dpmodel classes to PyTorch modules." to "refact (pt_expt): provide infrastructure for converting dpmodel classes to PyTorch modules." Feb 8, 2026
@wanghan-iapcm wanghan-iapcm added the "Test CUDA" (Trigger test CUDA workflow) label Feb 8, 2026
@github-actions github-actions Bot removed the "Test CUDA" (Trigger test CUDA workflow) label Feb 8, 2026
@wanghan-iapcm wanghan-iapcm requested a review from njzjz February 9, 2026 11:54
@njzjz njzjz added this pull request to the merge queue Feb 9, 2026
Merged via the queue into deepmodeling:master with commit 97d8ded Feb 9, 2026
70 checks passed
@wanghan-iapcm wanghan-iapcm deleted the refact-auto-setattr branch February 10, 2026 02:40
github-merge-queue Bot pushed a commit that referenced this pull request Feb 10, 2026
# EmbeddingNet Refactoring: Factory Function to Concrete Class

## Summary

This refactoring converts `EmbeddingNet` from a factory-generated
dynamic class to a concrete class in the dpmodel backend. This change
enables the auto-detection registry mechanism in pt_expt to work
seamlessly with EmbeddingNet attributes.

This PR should be considered after #5194 and #5204.

## Motivation

**Before**: `EmbeddingNet` was created by a factory function
`make_embedding_network(NativeNet, NativeLayer)`, producing a
dynamically-typed class `make_embedding_network.<locals>.EN`. This
caused two problems:

1. **Cannot be registered**: Dynamic classes can't be imported or
registered at module import time in the pt_expt registry
2. **Name-based hacks required**: pt_expt wrappers had to explicitly
check for `name == "embedding_net"` in `__setattr__` instead of using
the type-based auto-detection mechanism

**After**: `EmbeddingNet` is now a concrete class that can be registered
in the pt_expt auto-conversion registry, eliminating the need for
name-based special cases.
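
The naming problem described above can be reproduced with a toy factory. `make_net`, `Base`, and `ConcreteNet` are hypothetical names for this sketch only; it illustrates how Python names classes created inside a function, not the actual `make_embedding_network` code.

```python
def make_net(base):
    """Toy factory that builds a class inside a function body."""

    class EN(base):
        pass

    return EN


class Base:
    pass


class ConcreteNet(Base):
    """Defined at module level, so it has a stable, importable name."""


FactoryNet = make_net(Base)

print(FactoryNet.__qualname__)   # make_net.<locals>.EN
print(ConcreteNet.__qualname__)  # ConcreteNet
```

The `<locals>` segment in the qualified name marks a class that cannot be reached by a plain `from module import Name`, which is what makes import-time registration awkward.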

## Changes

### 1. dpmodel: Concrete `EmbeddingNet` class

**File**: `deepmd/dpmodel/utils/network.py`

- Replaced factory-generated class with concrete
`EmbeddingNet(NativeNet)` class
- Moved constructor logic from factory into `__init__`
- Fixed `deserialize` to use `type(obj.layers[0])` instead of hardcoding
`super(EmbeddingNet, obj)`, allowing pt_expt subclass to preserve its
converted torch layers
- Kept `make_embedding_network` factory for pt/pd backends that use
different base classes (MLP)

```python
class EmbeddingNet(NativeNet):
    """The embedding network."""

    def __init__(self, in_dim, neuron=[24, 48, 96], activation_function="tanh",
                 resnet_dt=False, precision=DEFAULT_PRECISION, seed=None,
                 bias=True, trainable=True):
        layers = []
        i_in = in_dim
        if isinstance(trainable, bool):
            trainable = [trainable] * len(neuron)
        for idx, ii in enumerate(neuron):
            i_ot = ii
            layers.append(
                NativeLayer(
                    i_in, i_ot, bias=bias, use_timestep=resnet_dt,
                    activation_function=activation_function, resnet=True,
                    precision=precision, seed=child_seed(seed, idx),
                    trainable=trainable[idx]
                ).serialize()
            )
            i_in = i_ot
        super().__init__(layers)
        self.in_dim = in_dim
        self.neuron = neuron
        self.activation_function = activation_function
        self.resnet_dt = resnet_dt
        self.precision = precision
        self.bias = bias

    @classmethod
    def deserialize(cls, data):
        data = data.copy()
        check_version_compatibility(data.pop("@Version", 1), 2, 1)
        data.pop("@Class", None)
        layers = data.pop("layers")
        obj = cls(**data)
        # Use type(obj.layers[0]) to respect subclass layer types
        layer_type = type(obj.layers[0])
        obj.layers = type(obj.layers)(
            [layer_type.deserialize(layer) for layer in layers]
        )
        return obj
```
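
The effect of looking up `type(obj.layers[0])` in `deserialize` can be seen in a stripped-down sketch. `Layer`, `WrappedLayer`, `Net`, and `WrappedNet` are hypothetical stand-ins with a single weight per layer; only the shape of the `deserialize` method mirrors the snippet above.

```python
class Layer:
    def __init__(self, weight):
        self.weight = weight

    def serialize(self):
        return {"weight": self.weight}

    @classmethod
    def deserialize(cls, data):
        return cls(data["weight"])


class WrappedLayer(Layer):
    """Stand-in for a backend-specific layer type."""


class Net:
    def __init__(self, n_layers):
        self.layers = [Layer(0.0) for _ in range(n_layers)]

    def serialize(self):
        return {
            "n_layers": len(self.layers),
            "layers": [layer.serialize() for layer in self.layers],
        }

    @classmethod
    def deserialize(cls, data):
        data = dict(data)
        layers = data.pop("layers")
        obj = cls(**data)
        # Ask the freshly built object which layer class it uses,
        # instead of hardcoding the base Layer class here.
        layer_type = type(obj.layers[0])
        obj.layers = type(obj.layers)(layer_type.deserialize(x) for x in layers)
        return obj


class WrappedNet(Net):
    def __init__(self, n_layers):
        super().__init__(n_layers)
        # The subclass swaps in its own layer type during construction.
        self.layers = [WrappedLayer.deserialize(x.serialize()) for x in self.layers]


src = Net(2)
src.layers[1].weight = 3.5
restored = WrappedNet.deserialize(src.serialize())
```

Here `restored.layers` holds `WrappedLayer` objects carrying the serialized weights, while `Net.deserialize` on the same data still yields plain `Layer` objects.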

### 2. pt_expt: Wrapper and registration

**File**: `deepmd/pt_expt/utils/network.py`

- Created `EmbeddingNet(EmbeddingNetDP, torch.nn.Module)` wrapper
- Converts dpmodel layers to pt_expt `NativeLayer` (torch modules) in
`__init__`
- Registered in auto-conversion registry

```python
class EmbeddingNet(EmbeddingNetDP, torch.nn.Module):
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        torch.nn.Module.__init__(self)
        EmbeddingNetDP.__init__(self, *args, **kwargs)
        # Convert dpmodel layers to pt_expt NativeLayer
        self.layers = torch.nn.ModuleList(
            [NativeLayer.deserialize(layer.serialize()) for layer in self.layers]
        )

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return torch.nn.Module.__call__(self, *args, **kwargs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.call(x)

register_dpmodel_mapping(
    EmbeddingNetDP,
    lambda v: EmbeddingNet.deserialize(v.serialize()),
)
```
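
The PR text does not say why the wrapper overrides `__call__`. One plausible reason, assuming the dpmodel base class defines its own `__call__` that forwards to `call()`, is method resolution order: with the dpmodel class listed first, its `__call__` would shadow the one from `torch.nn.Module`. The sketch below shows that effect with torch-free stand-ins (`DPBase`, `FakeModule`, and both wrappers are hypothetical).

```python
class DPBase:
    """Stand-in for a dpmodel-style base whose __call__ forwards to call()."""

    def __call__(self, x):
        return self.call(x)

    def call(self, x):
        return x * 2


class FakeModule:
    """Stand-in for a module base: __call__ does bookkeeping, then forward()."""

    def __init__(self):
        self.calls_seen = 0

    def __call__(self, x):
        self.calls_seen += 1  # represents hook handling in a real module
        return self.forward(x)


class NaiveWrapper(DPBase, FakeModule):
    def __init__(self):
        FakeModule.__init__(self)

    def forward(self, x):
        return self.call(x)


class Wrapper(NaiveWrapper):
    def __call__(self, x):
        # Route explicitly through the module-style __call__.
        return FakeModule.__call__(self, x)


naive, fixed = NaiveWrapper(), Wrapper()
naive_out, fixed_out = naive(1), fixed(1)
```

Both wrappers return the same value, but only `fixed` goes through the module-style bookkeeping: `naive.calls_seen` stays at 0 because `DPBase.__call__` wins in the MRO.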

### 3. TypeEmbedNet: Simplified to use registry

**File**: `deepmd/pt_expt/utils/type_embed.py`

- No longer needs name-based `embedding_net` check in `__setattr__`
- Uses common `dpmodel_setattr` which auto-converts via registry
- Imports `network` module to ensure `EmbeddingNet` registration happens
first

```python
class TypeEmbedNet(TypeEmbedNetDP, torch.nn.Module):
    def __setattr__(self, name: str, value: Any) -> None:
        # Auto-converts embedding_net via registry
        handled, value = dpmodel_setattr(self, name, value)
        if not handled:
            super().__setattr__(name, value)
```
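
The `(handled, value)` return convention used above can be sketched as follows. This is an illustration of the calling contract only: `toy_setattr`, `Converted`, and `Owner` are hypothetical, and the rules for what counts as "handled" are assumptions rather than the behaviour of the real `dpmodel_setattr`.

```python
from typing import Any


class Converted:
    """Stand-in for a value that has been wrapped for the target backend."""

    def __init__(self, inner):
        self.inner = inner


def toy_setattr(obj: Any, name: str, value: Any) -> tuple[bool, Any]:
    """Hypothetical helper following a (handled, value) contract."""
    if isinstance(value, list):
        # Stand-in for array-like data: the helper stores it itself.
        obj.__dict__.setdefault("_buffers", {})[name] = tuple(value)
        return True, value
    if isinstance(value, dict):
        # Stand-in for a convertible sub-object: hand back the converted value.
        return False, Converted(value)
    return False, value


class Owner:
    def __setattr__(self, name: str, value: Any) -> None:
        handled, value = toy_setattr(self, name, value)
        if not handled:
            super().__setattr__(name, value)


o = Owner()
o.table = [1, 2]   # stored by the helper, so handled is True
o.sub = {"k": 1}   # converted, then stored through the normal path
o.rcut = 6.0       # untouched, stored through the normal path
```

Returning the possibly converted value together with the flag lets the caller keep a single fall-through `super().__setattr__` call for everything the helper did not store itself.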

## Tests

### dpmodel tests

**File**: `source/tests/common/dpmodel/test_network.py`

Added to `TestEmbeddingNet` class:

1. **`test_is_concrete_class`**: Verifies `EmbeddingNet` is now a
concrete class, not factory output
2. **`test_forward_pass`**: Tests dpmodel forward pass produces correct
shapes
3. **`test_trainable_parameter_variants`**: Tests different trainable
configurations (all trainable, all frozen, mixed)

(The existing `test_embedding_net` test already covers
serialization/deserialization round-trip)

### pt_expt integration tests

**File**: `source/tests/pt_expt/utils/test_network.py`

Created `TestEmbeddingNetRefactor` test suite with 8 tests:

1. **`test_pt_expt_embedding_net_wraps_dpmodel`**: Verifies pt_expt
wrapper inherits correctly and converts layers
2. **`test_pt_expt_embedding_net_forward`**: Tests pt_expt forward pass
returns torch.Tensor
3. **`test_serialization_round_trip_pt_expt`**: Tests pt_expt
serialize/deserialize
4. **`test_deserialize_preserves_layer_type`**: Tests the key fix -
`deserialize` uses `type(obj.layers[0])` to preserve pt_expt's torch
layers
5. **`test_cross_backend_consistency`**: Tests numerical consistency
between dpmodel and pt_expt
6. **`test_registry_converts_dpmodel_to_pt_expt`**: Tests
`try_convert_module` auto-converts dpmodel to pt_expt
7. **`test_auto_conversion_in_setattr`**: Tests `dpmodel_setattr`
auto-converts EmbeddingNet attributes
8. **`test_trainable_parameter_handling`**: Tests trainable vs frozen
parameters work correctly in pt_expt

## Verification

All tests pass:

```bash
# dpmodel EmbeddingNet tests
python -m pytest source/tests/common/dpmodel/test_network.py::TestEmbeddingNet -v
# 4 passed in 0.41s

# pt_expt EmbeddingNet integration tests
python -m pytest source/tests/pt_expt/utils/test_network.py::TestEmbeddingNetRefactor -v
# 8 passed in 0.41s

# All pt_expt network tests
python -m pytest source/tests/pt_expt/utils/test_network.py -v
# 10 passed in 0.41s

# Descriptor tests (verify refactoring doesn't break existing code)
python -m pytest source/tests/pt_expt/descriptor/test_se_e2_a.py -v -k consistency
# 1 passed

python -m pytest source/tests/universal/pt_expt/descriptor/test_descriptor.py -v
# 8 passed in 3.27s
```

## Benefits

1. **Type-based auto-detection**: No more name-based special cases in
`__setattr__`
2. **Maintainability**: Single source of truth for EmbeddingNet in
dpmodel
3. **Consistency**: Same pattern as other dpmodel classes
(AtomExcludeMask, NetworkCollection, etc.)
4. **Future-proof**: New attributes in dpmodel automatically work in
pt_expt via registry

## Backward Compatibility

- Serialization format unchanged (version 2.1)
- All existing tests pass
- `make_embedding_network` factory kept for pt/pd backends
- No changes to public API

## Files Changed

### Modified

- `deepmd/dpmodel/utils/network.py`: Concrete EmbeddingNet class +
deserialize fix
- `deepmd/pt_expt/utils/network.py`: EmbeddingNet wrapper + registration
- `deepmd/pt_expt/utils/type_embed.py`: Simplified to use registry
- `source/tests/common/dpmodel/test_network.py`: Added dpmodel
EmbeddingNet tests (3 new tests)
- `source/tests/pt_expt/utils/test_network.py`: Added pt_expt
integration tests (8 new tests)

### No changes required

- All descriptor wrappers (se_e2_a, se_r, se_t, se_t_tebd) automatically
work via registry
- No changes to dpmodel logic or array_api_compat code


## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added PyTorch compatibility layer enabling DPModel neural network
components to be used with PyTorch workflows for training and inference
* Enhanced embedding network with explicit serialization and
deserialization capabilities

* **Refactor**
* Restructured embedding network with explicit class design for improved
type stability and control flow management

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
github-merge-queue Bot pushed a commit that referenced this pull request Feb 11, 2026
# FittingNet Refactoring: Factory Function to Concrete Class

## Summary

This refactoring converts `FittingNet` from a factory-generated dynamic
class to a concrete class in the dpmodel backend, following the same
pattern as the EmbeddingNet refactoring. This enables the auto-detection
registry mechanism in pt_expt to work seamlessly with FittingNet.

This PR should be considered after #5194 and #5204.

## Motivation

**Before**: `FittingNet` was created by a factory function
`make_fitting_network(EmbeddingNet, NativeNet, NativeLayer)`, producing
a dynamically-typed class. This caused:

1. **Cannot be registered**: Dynamic classes can't be imported or
registered at module import time in the pt_expt registry
2. **Type matching fails**: Each call to `make_fitting_network` creates
a new class type, so registry lookup by type fails

**After**: `FittingNet` is now a concrete class that can be registered
in the pt_expt auto-conversion registry.
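
A short sketch of why type-keyed lookup breaks for factory output: each call to a class factory yields a distinct class object, even when the name is identical. `make_fitting`, `Base`, and the `registry` dict are hypothetical names used only for this demonstration.

```python
def make_fitting(base):
    """Toy factory: every call executes the class statement again."""

    class FN(base):
        pass

    return FN


class Base:
    pass


A = make_fitting(Base)
B = make_fitting(Base)

# A registry keyed on class A knows nothing about class B.
registry = {A: "converter-for-A"}
obj = B()
lookup = registry.get(type(obj))
```

`A` and `B` share the name `FN` but are different classes, so `lookup` is `None` and `isinstance(obj, A)` is `False`.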

## Changes

### 1. dpmodel: Concrete `FittingNet` class

**File**: `deepmd/dpmodel/utils/network.py`

- Created concrete `FittingNet(EmbeddingNet)` class
- Moved constructor logic from factory into `__init__`
- Fixed `deserialize` to use `type(obj.layers[0])` instead of hardcoding
`T_Network.__init__(obj, layers)`, allowing pt_expt subclass to preserve
its converted torch layers
- Kept `make_fitting_network` factory for backwards compatibility (for
pt/pd backends)

```python
class FittingNet(EmbeddingNet):
    """The fitting network."""

    def __init__(self, in_dim, out_dim, neuron=[24, 48, 96],
                 activation_function="tanh", resnet_dt=False,
                 precision=DEFAULT_PRECISION, bias_out=True,
                 seed=None, trainable=True):
        # Handle trainable parameter
        if trainable is None:
            trainable = [True] * (len(neuron) + 1)
        elif isinstance(trainable, bool):
            trainable = [trainable] * (len(neuron) + 1)

        # Initialize embedding layers via parent
        super().__init__(
            in_dim, neuron=neuron,
            activation_function=activation_function,
            resnet_dt=resnet_dt, precision=precision,
            seed=seed, trainable=trainable[:-1]
        )

        # Add output layer
        i_in = neuron[-1] if len(neuron) > 0 else in_dim
        self.layers.append(
            NativeLayer(
                i_in, out_dim, bias=bias_out,
                use_timestep=False, activation_function=None,
                resnet=False, precision=precision,
                seed=child_seed(seed, len(neuron)),
                trainable=trainable[-1]
            )
        )
        self.out_dim = out_dim
        self.bias_out = bias_out

    @classmethod
    def deserialize(cls, data):
        data = data.copy()
        check_version_compatibility(data.pop("@Version", 1), 1, 1)
        data.pop("@Class", None)
        layers = data.pop("layers")
        obj = cls(**data)
        # Use type(obj.layers[0]) to respect subclass layer types
        layer_type = type(obj.layers[0])
        obj.layers = type(obj.layers)(
            [layer_type.deserialize(layer) for layer in layers]
        )
        return obj
```
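
The `trainable` handling in the constructor above boils down to expanding one flag into `len(neuron) + 1` per-layer flags, then splitting hidden layers from the output layer. The helper below is a standalone sketch of that step; the function name `normalize_trainable` and the length check are additions for illustration and are not part of the snippet.

```python
def normalize_trainable(trainable, n_hidden):
    """Expand `trainable` into one flag per layer (hidden layers plus output)."""
    n_layers = n_hidden + 1
    if trainable is None:
        return [True] * n_layers
    if isinstance(trainable, bool):
        return [trainable] * n_layers
    flags = list(trainable)
    # Length validation is an extra safeguard added in this sketch.
    if len(flags) != n_layers:
        raise ValueError(f"expected {n_layers} flags, got {len(flags)}")
    return flags


# Mixed configuration for a network with three hidden layers.
flags = normalize_trainable([True, False, True, False], n_hidden=3)
hidden_flags, output_flag = flags[:-1], flags[-1]
```

The `flags[:-1]` slice corresponds to what is passed to the parent constructor, and `flags[-1]` to the flag used for the output layer.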

### 2. pt_expt: Wrapper and registration

**File**: `deepmd/pt_expt/utils/network.py`

- Added import: `from deepmd.dpmodel.utils.network import FittingNet as
FittingNetDP`
- Created `FittingNet(FittingNetDP, torch.nn.Module)` wrapper
- Converts dpmodel layers to pt_expt `NativeLayer` (torch modules) in
`__init__`
- Registered in auto-conversion registry

```python
from deepmd.dpmodel.utils.network import FittingNet as FittingNetDP

class FittingNet(FittingNetDP, torch.nn.Module):
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        torch.nn.Module.__init__(self)
        FittingNetDP.__init__(self, *args, **kwargs)
        # Convert dpmodel layers to pt_expt NativeLayer
        self.layers = torch.nn.ModuleList(
            [NativeLayer.deserialize(layer.serialize()) for layer in self.layers]
        )

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return torch.nn.Module.__call__(self, *args, **kwargs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.call(x)

register_dpmodel_mapping(
    FittingNetDP,
    lambda v: FittingNet.deserialize(v.serialize()),
)
```

## Tests

### dpmodel tests

**File**: `source/tests/common/dpmodel/test_network.py`

Tests in the `TestFittingNet` class (three added, one pre-existing):

1. **`test_fitting_net`**: Original roundtrip serialization test
(already existed)
2. **`test_is_concrete_class`**: Verifies `FittingNet` is now a concrete
class, not factory output
3. **`test_forward_pass`**: Tests dpmodel forward pass produces correct
output shapes (single and batch)
4. **`test_trainable_parameter_variants`**: Tests different trainable
configurations (all trainable, all frozen, mixed)

### pt_expt integration tests

**File**: `source/tests/pt_expt/utils/test_network.py`

Created `TestFittingNetRefactor` test suite with 4 tests:

1. **`test_pt_expt_fitting_net_wraps_dpmodel`**: Verifies pt_expt
wrapper inherits correctly and converts layers
2. **`test_pt_expt_fitting_net_forward`**: Tests pt_expt forward pass
returns torch.Tensor with correct shape
3. **`test_serialization_round_trip_pt_expt`**: Tests pt_expt
serialize/deserialize round-trip
4. **`test_registry_converts_dpmodel_to_pt_expt`**: Tests
`try_convert_module` auto-converts dpmodel to pt_expt

## Verification

All tests pass:

```bash
# dpmodel network tests (includes new FittingNet tests)
python -m pytest source/tests/common/dpmodel/test_network.py -v
# 19 passed in 0.56s (was 16, added 3 FittingNet tests)

# dpmodel FittingNet tests specifically
python -m pytest source/tests/common/dpmodel/test_network.py::TestFittingNet -v
# 4 passed in 0.44s

# pt_expt network tests (EmbeddingNet + FittingNet)
python -m pytest source/tests/pt_expt/utils/test_network.py -v
# 14 passed in 0.45s

# Descriptor tests (verify refactoring doesn't break existing code)
python -m pytest source/tests/pt_expt/descriptor/ -v
# 8 passed in 5.43s
```

## Benefits

1. **Type-based auto-detection**: FittingNet now works with the registry
mechanism
2. **Consistency**: Same pattern as EmbeddingNet and other dpmodel
classes
3. **Maintainability**: Single source of truth for FittingNet in dpmodel
4. **Future-proof**: Any dpmodel FittingNet instances can be
auto-converted to pt_expt

## Backward Compatibility

- Serialization format unchanged (version 1)
- All existing tests pass
- `make_fitting_network` factory kept for pt/pd backends
- No changes to public API

## Files Changed

### Modified

- `deepmd/dpmodel/utils/network.py`: Concrete FittingNet class +
deserialize fix
- `deepmd/pt_expt/utils/network.py`: FittingNet wrapper + registration
- `source/tests/common/dpmodel/test_network.py`: Added dpmodel
FittingNet tests (3 new tests)
- `source/tests/pt_expt/utils/test_network.py`: Added pt_expt
integration tests (4 new tests)

### Pattern

This refactoring follows the exact same pattern as
`EMBEDDING_NET_REFACTOR.md`:

1. Convert factory-generated class to concrete class in dpmodel
2. Fix `deserialize` to use `type(obj.layers[0])`
3. Create pt_expt wrapper with layer conversion in `__init__`
4. Register with `register_dpmodel_mapping`
5. Add comprehensive tests


## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added PyTorch experimental descriptor implementations for SeT and
SeTTebd with full export/tracing support
* Introduced PyTorch-compatible wrapper classes for network components
enabling seamless integration with PyTorch workflows

* **Improvements**
* Enhanced device-aware tensor operations across all descriptors for
better multi-device support
* Improved error handling with explicit error messages when statistics
are missing instead of silent failures
* Refactored FittingNet as a concrete class with explicit public
interface

* **Tests**
* Added comprehensive test coverage for new PyTorch experimental
descriptors and network wrappers
* Added unit tests validating serialization, deserialization, and
forward pass behavior

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
github-merge-queue Bot pushed a commit that referenced this pull request Feb 11, 2026
This PR should be considered after #5194, #5204, and #5205.


## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added experimental PyTorch support for SeT and SeT-TEBD descriptors,
enabling model training and serialization/export.
* Introduced TypeEmbedNet wrapper for type embedding integration in
PyTorch workflows.

* **Bug Fixes**
* Improved backend compatibility and device-aware tensor allocation
across descriptor implementations.
  * Fixed PyTorch tensor indexing compatibility issues.

* **Tests**
* Added comprehensive test coverage for new experimental descriptors and
consistency validation.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>