[Code scan] Close TensorFlow EwaldRecp sessions #5685

Description

@njzjz

This issue comes from a Codex global scan of deepmodeling/deepmd-kit at commit 73de44b1f94471b2e3bdb6b11f57b34d7bc791bb.

Problem

EwaldRecp creates a dedicated TensorFlow session in its constructor:

with tf.Graph().as_default() as graph:
    # place holders
    self.t_nloc = tf.placeholder(tf.int32, [1], name="t_nloc")
    self.t_coord = tf.placeholder(
        GLOBAL_TF_FLOAT_PRECISION, [None], name="t_coord"
    )
    self.t_charge = tf.placeholder(
        GLOBAL_TF_FLOAT_PRECISION, [None], name="t_charge"
    )
    self.t_box = tf.placeholder(GLOBAL_TF_FLOAT_PRECISION, [None], name="t_box")
    # output
    self.t_energy, self.t_force, self.t_virial = op_module.ewald_recp(
        self.t_coord,
        self.t_charge,
        self.t_nloc,
        self.t_box,
        ewald_h=self.hh,
        ewald_beta=self.beta,
    )
self.sess = tf.Session(graph=graph, config=default_tf_session_config)

DipoleChargeModifier creates an EwaldRecp instance and keeps it for modifier evaluation:

# the dipole model is loaded with prefix 'dipole_charge'
self.modifier_prefix = "dipole_charge"
# init dipole model
DeepDipole.__init__(
    self, model_name, load_prefix=self.modifier_prefix, default_tf_graph=True
)
self.model_name = model_name
self.model_charge_map = model_charge_map
self.sys_charge_map = sys_charge_map
self.sel_type = list(self.get_sel_type())
# init ewald recp
self.ewald_h = ewald_h
self.ewald_beta = ewald_beta
self.er = EwaldRecp(self.ewald_h, self.ewald_beta)

The scanned modifier and Ewald files do not expose a close method, context-manager path, or destructor that closes EwaldRecp.sess.

Impact

Creating and discarding DipoleChargeModifier / EwaldRecp instances can leak TensorFlow sessions and graph resources. Long-running Python processes that repeatedly construct modifiers can accumulate those resources until process exit.

Suggested fix

Add an explicit close() method to EwaldRecp and have DipoleChargeModifier forward it. Consider context-manager support and a defensive destructor, matching the session lifecycle used elsewhere in the TensorFlow inference stack.
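
A minimal sketch of what that lifecycle could look like, not the actual deepmd-kit implementation. The class names EwaldRecpSketch and DipoleChargeModifierSketch, the session_factory parameter, and the FakeSession stand-in are all hypothetical and exist only so the example runs without TensorFlow. In the real class, self.sess would be the tf.Session created in the constructor, which already provides a close() method.

```python
class FakeSession:
    """Hypothetical stand-in for tf.Session so the sketch runs without TensorFlow."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class EwaldRecpSketch:
    """Hypothetical lifecycle wrapper; only the session handling is modelled."""

    def __init__(self, session_factory=FakeSession):
        # Assumption: in EwaldRecp this is tf.Session(graph=graph, config=...).
        self.sess = session_factory()

    def close(self):
        """Close the owned session; calling this more than once is harmless."""
        sess = getattr(self, "sess", None)
        if sess is not None:
            sess.close()
            self.sess = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions raised in the with-body

    def __del__(self):
        # Defensive fallback only; callers should prefer close() or `with`.
        try:
            self.close()
        except Exception:
            pass


class DipoleChargeModifierSketch:
    """Hypothetical owner that forwards close() to its EwaldRecp instance."""

    def __init__(self, session_factory=FakeSession):
        self.er = EwaldRecpSketch(session_factory)

    def close(self):
        er = getattr(self, "er", None)
        if er is not None:
            er.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False
```

With this shape, a caller that constructs modifiers repeatedly can write `with DipoleChargeModifierSketch() as modifier: ...` and the Ewald session is released when the block exits. The real DipoleChargeModifier also inherits from DeepDipole, so its close() would likely need to release that session as well; how to do so depends on the lifecycle API the inference stack already exposes, which this sketch does not model.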
