[Code scan] System.map_atom_types ignores the current atom order #984

Description

@njzjz

This issue is part of a Codex global repository code scan.

System.map_atom_types() rebuilds the mapped atom sequence from grouped atom_names and atom_numbs instead of mapping the current self.data["atom_types"] sequence. For systems whose atom order is not grouped by type, the returned type array is wrong.

Affected code:

dpdata/dpdata/system.py

Lines 371 to 374 in a7a50bf

atom_types_list = []
for name, numb in zip(self.get_atom_names(), self.get_atom_numbs()):
    atom_types_list.extend([name] * numb)
new_atom_types = np.array([type_map[ii] for ii in atom_types_list], dtype=int)

Minimal reproducer:

import numpy as np
import dpdata

data = {
    "atom_names": ["O", "H"],
    "atom_numbs": [1, 2],
    "atom_types": np.array([1, 0, 1]),
    "orig": np.zeros(3),
    "cells": np.eye(3).reshape(1, 3, 3),
    "coords": np.zeros((1, 3, 3)),
}

s = dpdata.System(data=data)
print(s.map_atom_types({"H": 0, "O": 1}).tolist())

Current output:

[1, 0, 0]

Expected output, preserving the current atom order [H, O, H]:

[0, 1, 0]

The implementation should map each entry in self.data["atom_types"] through self.data["atom_names"] instead of expanding grouped counts.
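
A minimal sketch of that mapping, written as a standalone function so it runs without dpdata. The name map_atom_types_in_order and its signature are hypothetical and are not part of the dpdata API; the sketch assumes atom_types holds integer indices into atom_names and that type_map has an entry for every name.

```python
import numpy as np


def map_atom_types_in_order(atom_names, atom_types, type_map):
    """Hypothetical helper: map per-atom type indices through type_map.

    atom_names: list of element names, indexed by the old type index.
    atom_types: per-atom old type indices, in the system's current atom order.
    type_map:   dict from element name to new type index.
    """
    # lookup[i] is the new type index for the i-th entry of atom_names.
    lookup = np.array([type_map[name] for name in atom_names], dtype=int)
    # Index the lookup table with the per-atom types, so atom order is preserved.
    return lookup[np.asarray(atom_types, dtype=int)]


# Same data as the reproducer: atom order is [H, O, H].
mapped = map_atom_types_in_order(["O", "H"], np.array([1, 0, 1]), {"H": 0, "O": 1})
print(mapped.tolist())  # [0, 1, 0]
```

For input that is already grouped by type, this gives the same result as expanding atom_names by atom_numbs, so only ungrouped systems would see a change.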
