fix(tf): close EwaldRecp TensorFlow sessions#5708
Conversation
EwaldRecp creates a dedicated tf.Session in its constructor but exposed no way to close it, and DipoleChargeModifier holds an EwaldRecp without releasing it. Repeatedly constructing and discarding these objects leaked TensorFlow sessions and graph resources in long-running processes. Add close(), context-manager support, and a defensive __del__ to EwaldRecp, and have DipoleChargeModifier.close() forward to the EwaldRecp. Add tests that the session is released after close()/context-manager exit and that the modifier forwards close to its evaluator. Fix deepmodeling#5685
📝 WalkthroughWalkthroughAdds explicit TensorFlow session lifecycle management to ChangesEwaldRecp session cleanup
Estimated code review effort: 2 (Simple) | ~10 minutes Sequence Diagram(s)sequenceDiagram
participant DipoleChargeModifier
participant EwaldRecp
participant TFSession
DipoleChargeModifier->>DipoleChargeModifier: close()
DipoleChargeModifier->>EwaldRecp: er.close()
EwaldRecp->>TFSession: sess.close()
Related Issues: Suggested labels: enhancement, tests Suggested reviewers: njzjz 🐰 A session once left open wide, 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/tf/modifier/dipole_charge.py`:
- Around line 108-113: The DipoleChargeModifier.close method only releases the
Ewald evaluator stored in self.er and leaves the TensorFlow session inherited
from DeepDipoleOld open. Update close() in DipoleChargeModifier to also guard
and close self.sess after closing er, so repeated create/discard cycles fully
release both sessions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 91599b1b-6ed5-4ae4-bbea-09f84b087cc3
📒 Files selected for processing (4)
deepmd/tf/infer/ewald_recp.pydeepmd/tf/modifier/dipole_charge.pysource/tests/tf/test_dipolecharge.pysource/tests/tf/test_ewald.py
| def close(self) -> None: | ||
| """Close the TensorFlow session held by the Ewald reciprocal evaluator.""" | ||
| er = getattr(self, "er", None) | ||
| if er is not None: | ||
| er.close() | ||
|
|
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check whether DeepDipole/DeepEval/BaseModifier define close() or otherwise
# manage self.sess/self.graph lifecycle.
rg -n -C3 'class DeepDipole\b|class DeepEval\b|class BaseModifier\b' --type=py
rg -n -C5 '\bdef close\b' deepmd/tfRepository: deepmodeling/deepmd-kit
Length of output: 161
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the relevant files and inspect the class definitions and close-related methods.
git ls-files 'deepmd/tf/modifier/dipole_charge.py' 'deepmd/tf/**/*.py' | sed -n '1,200p'
echo '--- dipole_charge.py ---'
cat -n deepmd/tf/modifier/dipole_charge.py | sed -n '1,220p'
echo '--- search close/session lifecycle ---'
rg -n -C4 '\bdef close\b|self\.sess|self\.graph|class DeepDipole\b|class DeepEval\b|class BaseModifier\b' deepmd/tfRepository: deepmodeling/deepmd-kit
Length of output: 49678
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the base classes that provide the TF session/graph and whether they already
# define close() or other cleanup hooks.
sed -n '1,260p' deepmd/tf/infer/deep_dipole.py
echo '---'
sed -n '1,260p' deepmd/tf/modifier/base_modifier.py
echo '---'
sed -n '1,260p' deepmd/tf/infer/deep_eval.py | sed -n '1,220p'
echo '---'
rg -n -C3 '\bdef close\b|`@cached_property`\s+def sess\b|self\.sess\s*=|self\.graph\s*=' deepmd/tf/infer/deep_dipole.py deepmd/tf/infer/deep_eval.py deepmd/tf/modifier/base_modifier.pyRepository: deepmodeling/deepmd-kit
Length of output: 11583
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check whether DeepPot/DeepTensor already provide cleanup hooks that DipoleChargeModifier
# would inherit through BaseModifier -> DeepPot -> DeepEval/DeepTensor.
rg -n -C4 '\bdef close\b|class DeepPot\b|class DeepTensor\b|class DeepEvalBackend\b|class DeepEvalOld\b' deepmd/tf/infer/deep_pot.py deepmd/tf/infer/deep_tensor.py deepmd/tf/infer/deep_eval.pyRepository: deepmodeling/deepmd-kit
Length of output: 1079
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find any cleanup hooks on the inherited TF evaluation classes.
rg -n -C3 '\bdef close\b|class DeepPot\b|class DeepTensor\b|class DeepEvalOld\b|class DeepEvalBackend\b' \
deepmd/tf/infer/deep_pot.py \
deepmd/tf/infer/deep_tensor.py \
deepmd/tf/infer/deep_eval.py \
deepmd/tf/infer/deep_dipole.pyRepository: deepmodeling/deepmd-kit
Length of output: 900
Close the modifier's own TF session as well. close() currently releases only self.er; DipoleChargeModifier also keeps self.sess open from DeepDipoleOld, so repeated create/discard cycles still leak the main TensorFlow session. Add a guarded self.sess.close() here too.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/tf/modifier/dipole_charge.py` around lines 108 - 113, The
DipoleChargeModifier.close method only releases the Ewald evaluator stored in
self.er and leaves the TensorFlow session inherited from DeepDipoleOld open.
Update close() in DipoleChargeModifier to also guard and close self.sess after
closing er, so repeated create/discard cycles fully release both sessions.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #5708 +/- ##
==========================================
- Coverage 81.97% 81.09% -0.88%
==========================================
Files 959 980 +21
Lines 105748 109369 +3621
Branches 4102 4207 +105
==========================================
+ Hits 86684 88695 +2011
- Misses 17573 19152 +1579
- Partials 1491 1522 +31 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Problem
EwaldRecpcreates a dedicatedtf.Sessionin its constructor but exposes no way to close it (noclose(), context-manager path, or destructor).DipoleChargeModifiercreates and keeps anEwaldRecpinstance without releasing it either. Creating and discarding these objects therefore leaks TensorFlow sessions and graph resources; long-running processes that repeatedly construct modifiers accumulate them until process exit.Fix
Add
close()(idempotent, guarded), context-manager support (__enter__/__exit__), and a defensive__del__toEwaldRecp, and haveDipoleChargeModifier.close()forward to the heldEwaldRecp.Test
test_ewald.pypreviously exercised onlyeval, never lifecycle. New tests assert thatEwaldRecp.close()and context-manager exit release the session (a subsequentevalraises on the closed session), and thatDipoleChargeModifier.close()forwards to itsEwaldRecp.close(). All three fail on the current code withAttributeError(no such methods) and pass after the fix.Fix #5685
Summary by CodeRabbit
New Features
Tests