Merged
2 changes: 1 addition & 1 deletion doc/model/dpa2.md
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ When using the JAX backend, 2 or more MPI ranks are not supported. One must set
atom_modify map yes
```

- See the example `examples/water/lmp/jax_dpa2.lammps`.
+ See the example `examples/water/lmp/jax_dpa.lammps`.

## Data format

76 changes: 76 additions & 0 deletions doc/model/dpa3.md
@@ -0,0 +1,76 @@
# Descriptor DPA-3 {{ pytorch_icon }} {{ jax_icon }} {{ dpmodel_icon }}

:::{note}
**Supported backends**: PyTorch {{ pytorch_icon }}, JAX {{ jax_icon }}, DP {{ dpmodel_icon }}
:::

DPA-3 is an advanced interatomic potential built on a message-passing architecture.
Designed as a large atomic model (LAM), DPA-3 is tailored to integrate and simultaneously train on datasets from various disciplines,
encompassing diverse chemical and materials systems across different research domains.
Its model design ensures exceptional fitting accuracy and robust generalization both within and beyond the training domain.
Furthermore, DPA-3 maintains energy conservation and respects the physical symmetries of the potential energy surface,
making it a dependable tool for a wide range of scientific applications.

Reference: will be released soon.

Training example: `examples/water/dpa3/input_torch.json`.

## Hyperparameter tests

We systematically trained DPA-3 on six representative DFT datasets (available at [AIS-Square](https://www.aissquare.com/datasets/detail?pageType=datasets&name=DPA3_hyperparameter_search&id=316)):
metallic systems (`Alloy`, `AlMgCu`, `W`), a covalent material (`Boron`), a molecular system (`Drug`), and liquid water (`Water`).
Under consistent training conditions (0.5M training steps, `batch_size` set to `"auto:128"`),
we evaluated the impact of several critical hyperparameters on validation accuracy.

The comparison focuses on the RMSEs (root mean square errors) of energy, force, and virial predictions, averaged across all six systems,
with results tabulated below to guide scenario-specific hyperparameter selection:

| Model | comment | nlayers | n_dim | e_dim | a_dim | e_sel | a_sel | start_lr | stop_lr | loss prefactors | rmse_e (meV/atom) | rmse_f (meV/Å) | rmse_v (meV/atom) | Training wall time (h) |
| ---------------- | --------------- | ------- | ------- | ------ | ----- | ------- | ------ | -------- | -------- | ------------------------- | ----------------- | -------------- | ----------------- | ---------------------- |
| DPA3-L3 | Default | 3 | 256 | 128 | 32 | 120 | 30 | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 5.74 | 85.4 | 43.1 | 9.8 |
| | Small dimension | 3 | **128** | **64** | 32 | 120 | 30 | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 6.99 | 93.6 | 46.7 | 8.0 |
| | Large sel | 3 | 256 | 128 | 32 | **154** | **48** | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 5.70 | 83.7 | 43.4 | 14.1 |
| DPA3-L6 | Default | 6 | 256 | 128 | 32 | 120 | 30 | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 4.85 | 79.9 | 39.7 | 19.2 |
| | Small dimension | 6 | **128** | **64** | 32 | 120 | 30 | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 5.11 | 77.7 | 41.2 | 14.1 |
| | Large sel | 6 | 256 | 128 | 32 | **154** | **48** | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 4.76 | 78.4 | 40.2 | 31.8 |
| DPA2-L6 (medium) | Default | 6 | - | - | - | - | - | 1e-3 | 3.51e-08 | 0.02\|1, 1000\|1, 0.02\|1 | 12.12 | 109.3 | 83.1 | 12.2 |

The loss prefactors (0.2|20, 100|60, 0.02|1) correspond to (`start_pref_e`|`limit_pref_e`, `start_pref_f`|`limit_pref_f`, `start_pref_v`|`limit_pref_v`) respectively.
Virial RMSEs were averaged exclusively for systems containing virial labels (`Alloy`, `AlMgCu`, `W`, and `Boron`).
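
The following is a minimal sketch of how a start/limit prefactor pair could evolve during training. It assumes the schedule described in DeePMD-kit's energy-loss documentation, in which each prefactor is interpolated linearly in the current learning rate; that formula is not stated on this page, and the helper names `loss_prefactor` and `prefactors_at` are hypothetical.

```python
# Assumption: each prefactor moves from its start value to its limit value
# in proportion to the learning-rate decay, lr / start_lr.
START_LR, STOP_LR = 1e-3, 3e-5

# (start_pref, limit_pref) pairs taken from the table above.
PREFS = {"e": (0.2, 20.0), "f": (100.0, 60.0), "v": (0.02, 1.0)}


def loss_prefactor(start_pref, limit_pref, lr, start_lr):
    """Interpolate between the start and limit prefactors at learning rate lr."""
    ratio = lr / start_lr
    return limit_pref * (1.0 - ratio) + start_pref * ratio


def prefactors_at(lr):
    """Return the energy, force, and virial prefactors at learning rate lr."""
    return {
        key: loss_prefactor(start, limit, lr, START_LR)
        for key, (start, limit) in PREFS.items()
    }


if __name__ == "__main__":
    for lr in (START_LR, 1e-4, STOP_LR):
        print(f"lr={lr:g}: {prefactors_at(lr)}")
```

Under this assumption the prefactors equal their start values at `start_lr` and approach, but do not exactly reach, their limit values at `stop_lr`, because the learning rate never decays to zero.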

Note that all DPA-3 models here use `float32` precision, while the other models use `float64` by default.
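
As an illustration of how the table rows relate to one another, the sketch below expresses each DPA-3 row as a small set of overrides on the DPA3-L6 default `repflow` settings. The key names and values come from the table and the training example; the variant labels and the `make_repflow` helper are hypothetical and not part of DeePMD-kit.

```python
import copy

# Baseline values from the DPA3-L6 "Default" row of the table.
BASE_REPFLOW = {
    "nlayers": 6,
    "n_dim": 256,
    "e_dim": 128,
    "a_dim": 32,
    "e_sel": 120,
    "a_sel": 30,
}

# Hypothetical labels; each entry lists only the keys that differ from the baseline.
VARIANTS = {
    "L3-default": {"nlayers": 3},
    "L3-small-dim": {"nlayers": 3, "n_dim": 128, "e_dim": 64},
    "L3-large-sel": {"nlayers": 3, "e_sel": 154, "a_sel": 48},
    "L6-default": {},
    "L6-small-dim": {"n_dim": 128, "e_dim": 64},
    "L6-large-sel": {"e_sel": 154, "a_sel": 48},
}


def make_repflow(variant):
    """Return a copy of the baseline repflow settings with one row's overrides applied."""
    settings = copy.deepcopy(BASE_REPFLOW)
    settings.update(VARIANTS[variant])
    return settings


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, make_repflow(name))
```

The resulting dictionary could then be merged into the `repflow` section of the descriptor in a training input such as `examples/water/dpa3/input_torch.json`, leaving the remaining keys unchanged.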

## Requirements of installation from source code {{ pytorch_icon }}

To run the DPA-3 model on LAMMPS via source code installation
(users can skip this step if using [easy installation](../install/easy-install.md)),
the custom OP library for Python interface integration must be compiled and linked
during the [model freezing process](../freeze/freeze.md).

The customized OP library for the Python interface can be installed by setting the environment variable {envvar}`DP_ENABLE_PYTORCH` to `1` during installation.

If one runs LAMMPS with MPI, the customized OP library for the C++ interface should be compiled against the same MPI library as the runtime MPI.
If one runs LAMMPS with MPI and CUDA devices, it is recommended to compile the customized OP library for the C++ interface with a [CUDA-Aware MPI](https://developer.nvidia.com/mpi-solutions-gpus) library and CUDA;
otherwise, the communication between GPU cards falls back to the slower CPU implementation.

## Limitations of the JAX backend with LAMMPS {{ jax_icon }}

When using the JAX backend, running with two or more MPI ranks is not supported. One must also set `map` to `yes` using the [`atom_modify`](https://docs.lammps.org/atom_modify.html) command.

```lammps
atom_modify map yes
```

See the example `examples/water/lmp/jax_dpa.lammps`.

## Data format

DPA-3 supports both the [standard data format](../data/system.md) and the [mixed type data format](../data/system.md#mixed-type).

## Type embedding

Type embedding is built into this descriptor and has the same dimension as the node embedding, which is set by the {ref}`n_dim <model[standard]/descriptor[dpa3]/repflow/n_dim>` argument.

## Model compression

Model compression is not supported in this descriptor.
1 change: 1 addition & 0 deletions doc/model/index.rst
@@ -10,6 +10,7 @@ Model
train-se-e3
train-se-atten
dpa2
dpa3
train-hybrid
sel
train-energy
1 change: 1 addition & 0 deletions doc/model/train-energy-spin.md
@@ -51,6 +51,7 @@ In PyTorch/DP, the spin implementation is more flexible and so far supports the
- `se_e2_a`
- `dpa1`(`se_atten`)
- `dpa2`
- `dpa3`

See `se_e2_a` examples in `$deepmd_source_dir/examples/spin/se_e2_a/input_torch.json`, the {ref}`spin <model/spin>` section is defined as the following with a much more clear interface:

4 changes: 4 additions & 0 deletions examples/water/dpa3/README.md
@@ -0,0 +1,4 @@
# Input for the DPA-3 model

This directory stores configuration files for training the 6-layer DPA-3 model.
For guidance on hyperparameter selection, consult the [DPA-3 documentation](../../../doc/model/dpa3.md#hyperparameter-tests).
94 changes: 94 additions & 0 deletions examples/water/dpa3/input_torch.json
@@ -0,0 +1,94 @@
{
"_comment": "that's all",
"model": {
"type_map": [
"O",
"H"
],
"descriptor": {
"type": "dpa3",
"repflow": {
"n_dim": 256,
"e_dim": 128,
"a_dim": 32,
"nlayers": 6,
"e_rcut": 6.0,
"e_rcut_smth": 3.0,
"e_sel": 120,
"a_rcut": 4.0,
"a_rcut_smth": 2.0,
"a_sel": 30,
"axis_neuron": 4,
"skip_stat": true,
"a_compress_rate": 1,
"a_compress_e_rate": 2,
"a_compress_use_split": true,
"update_angle": true,
"update_style": "res_residual",
"update_residual": 0.1,
"update_residual_init": "const"
},
"activation_function": "silut:10.0",
"use_tebd_bias": false,
"precision": "float32",
"concat_output_tebd": false
},
"fitting_net": {
"neuron": [
240,
240,
240
],
"resnet_dt": true,
"precision": "float32",
"activation_function": "silut:10.0",
"seed": 1,
"_comment": " that's all"
},
"_comment": " that's all"
},
"learning_rate": {
"type": "exp",
"decay_steps": 5000,
"start_lr": 0.001,
"stop_lr": 3e-5,
"_comment": "that's all"
},
"loss": {
"type": "ener",
"start_pref_e": 0.2,
"limit_pref_e": 20,
"start_pref_f": 100,
"limit_pref_f": 60,
"start_pref_v": 0.02,
"limit_pref_v": 1,
"_comment": " that's all"
},
"training": {
"stat_file": "./dpa3.hdf5",
"training_data": {
"systems": [
"../data/data_0",
"../data/data_1",
"../data/data_2"
],
"batch_size": 1,
"_comment": "that's all"
},
"validation_data": {
"systems": [
"../data/data_3"
],
"batch_size": 1,
"_comment": "that's all"
},
"numb_steps": 1000000,
"warmup_steps": 0,
"gradient_max_norm": 5.0,
"seed": 10,
"disp_file": "lcurve.out",
"disp_freq": 100,
"save_freq": 2000,
"_comment": "that's all"
}
}
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
units metal
boundary p p p
atom_style atomic
- # Below line is required when using DPA-2 with the JAX backend
+ # Below line is required when using DPA-2/3 with the JAX backend
atom_modify map yes

neighbor 2.0 bin
1 change: 1 addition & 0 deletions source/tests/common/test_examples.py
@@ -58,6 +58,7 @@
p_examples / "water" / "dpa2" / "input_torch_medium.json",
p_examples / "water" / "dpa2" / "input_torch_large.json",
p_examples / "water" / "dpa2" / "input_torch_compressible.json",
p_examples / "water" / "dpa3" / "input_torch.json",
p_examples / "property" / "train" / "input_torch.json",
p_examples / "water" / "se_e3_tebd" / "input_torch.json",
p_examples / "hessian" / "single_task" / "input.json",