
[Code scan] Propagate frame-level data modifier failures #5690

Description

@njzjz

This issue comes from a Codex global scan of deepmodeling/deepmd-kit at commit 73de44b1f94471b2e3bdb6b11f57b34d7bc791bb.

Problem

DeepmdData.get_single_frame() runs a data modifier through a one-off ThreadPoolExecutor, but never calls future.result():

    if self.modifier is not None:
        with ThreadPoolExecutor(max_workers=num_worker) as executor:
            # Apply modifier if it exists
            future = executor.submit(
                self.modifier.modify_data,
                frame_data,
                self,
            )
        if self.use_modifier_cache:
            # Cache the modified frame to avoid recomputation
            self._modified_frame_cache[index] = copy.deepcopy(frame_data)

The context manager waits for the submitted task to finish, but exceptions raised inside the future stay stored on the Future object unless result() is called. The same class calls modifiers directly in other paths, so those paths do propagate errors:

        if self.modifier is not None:
            self.modifier.modify_data(ret, self)
        return ret

    def _load_batch_set(self, set_name: DPPath) -> None:
        if not hasattr(self, "batch_set") or self.get_numb_set() > 1:
            self.batch_set = self._load_set(set_name)
            if self.modifier is not None:
                self.modifier.modify_data(self.batch_set, self)
        self.batch_set, _ = self._shuffle_data(self.batch_set)
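
The Future behavior described above can be reproduced with the standard library alone. The sketch below is not deepmd-kit code: failing_modifier, run_without_result, and run_with_result are hypothetical names used only to contrast the two patterns.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_modifier(frame):
    """Hypothetical stand-in for modifier.modify_data that always fails."""
    raise ValueError("modifier failed")


def run_without_result(frame):
    # Same shape as the reported pattern: submit, then leave the with-block.
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(failing_modifier, frame)
    # The task has finished, but its exception is only stored on the Future.
    return future


def run_with_result(frame):
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(failing_modifier, frame)
        future.result()  # re-raises the ValueError in the calling thread


future = run_without_result({"coord": [0.0]})
swallowed = future.exception()  # the stored exception, never raised to the caller

try:
    run_with_result({"coord": [0.0]})
    propagated = False
except ValueError:
    propagated = True
```

In the first function the caller continues as if nothing went wrong; in the second, the same failure surfaces as an ordinary exception.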

Impact

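Frame-level data loading can silently ignore modifier failures. If modifier.modify_data(...) raises, get_single_frame() still returns normally, so callers receive a frame that is unmodified (or only partially modified, if the modifier failed midway) and, when caching is enabled, that bad frame can be stored in _modified_frame_cache and reused on later reads.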
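
A minimal sketch of this impact, assuming a simplified loader rather than the real DeepmdData class: cache, broken_modify_data, and load_frame_buggy are hypothetical names, and the frame is a placeholder dict.

```python
import copy
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the per-frame modifier cache.
cache = {}


def broken_modify_data(frame_data):
    """Hypothetical modifier that fails before changing anything."""
    raise RuntimeError("modifier failed")


def load_frame_buggy(index):
    # Placeholder frame; a real loader would read it from disk.
    frame_data = {"energy": 1.0}
    with ThreadPoolExecutor(max_workers=1) as executor:
        # result() is never called, so the RuntimeError is not raised here.
        executor.submit(broken_modify_data, frame_data)
    # The unmodified frame is cached as if the modifier had succeeded.
    cache[index] = copy.deepcopy(frame_data)
    return frame_data


frame = load_frame_buggy(0)
```

No exception reaches the caller, and the cache entry for index 0 holds the unmodified frame.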

Suggested fix

Call future.result() before leaving the modifier block, or remove the single-task executor and call self.modifier.modify_data(frame_data, self) directly. If parallelism is needed, keep exception propagation explicit before caching the modified frame.
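
One way the first option could look is sketched below. FrameLoader, ShiftModifier, and BrokenModifier are hypothetical stand-ins, not the actual deepmd-kit classes; only the placement of future.result() ahead of the cache write is the point.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class FrameLoader:
    """Hypothetical, stripped-down stand-in for DeepmdData."""

    def __init__(self, modifier=None, use_modifier_cache=True):
        self.modifier = modifier
        self.use_modifier_cache = use_modifier_cache
        self._modified_frame_cache = {}

    def get_single_frame(self, index, num_worker=1):
        # Placeholder frame; the real method loads it from disk.
        frame_data = {"energy": float(index)}
        if self.modifier is not None:
            with ThreadPoolExecutor(max_workers=num_worker) as executor:
                future = executor.submit(
                    self.modifier.modify_data,
                    frame_data,
                    self,
                )
                # Re-raise any modifier exception before the cache is touched.
                future.result()
            if self.use_modifier_cache:
                self._modified_frame_cache[index] = copy.deepcopy(frame_data)
        return frame_data


class ShiftModifier:
    """Hypothetical modifier that succeeds."""

    def modify_data(self, data, data_sys):
        data["energy"] += 1.0


class BrokenModifier:
    """Hypothetical modifier that always fails."""

    def modify_data(self, data, data_sys):
        raise RuntimeError("modifier failed")


good = FrameLoader(ShiftModifier())
frame = good.get_single_frame(2)

bad = FrameLoader(BrokenModifier())
try:
    bad.get_single_frame(0)
    raised = False
except RuntimeError:
    raised = True
```

With result() in place, the failing modifier raises in the caller and nothing is written to the cache. Calling self.modifier.modify_data(frame_data, self) directly would give the same propagation without the executor.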

