Add resuming support in AFE Protocols#1808

Open
IAlibay wants to merge 15 commits into main from resume-afe

IAlibay (Member) commented Jan 22, 2026

Fixes #1725
Similar to #1774

Checklist

  • All new code is appropriately documented (user-facing code must have complete docstrings).
  • Added a news entry, or the changes are not user-facing.
  • Ran pre-commit: you can run pre-commit locally or comment on this PR with pre-commit.ci autofix.

Manual Tests: these are slow so don't need to be run every commit, only before merging and when relevant changes are made (generally at reviewer-discretion).

Developers certificate of origin

codecov bot commented Jan 22, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 155 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.20%. Comparing base (d69baa6) to head (68a2bc3).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...fe/tests/protocols/openmm_ahfe/test_ahfe_resume.py | 40.09% | 124 Missing ⚠️ |
| src/openfe/protocols/openmm_afe/base_afe_units.py | 77.19% | 13 Missing ⚠️ |
| src/openfe/tests/protocols/conftest.py | 64.28% | 10 Missing ⚠️ |
| ...sts/protocols/openmm_rfe/test_hybrid_top_resume.py | 0.00% | 7 Missing ⚠️ |
| src/openfe/protocols/openmm_rfe/hybridtop_units.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1808      +/-   ##
==========================================
- Coverage   94.79%   91.20%   -3.60%     
==========================================
  Files         205      206       +1     
  Lines       17957    18241     +284     
==========================================
- Hits        17022    16636     -386     
- Misses        935     1605     +670     
Flag Coverage Δ
fast-tests 91.20% <50.00%> (?)
slow-tests ?

IAlibay marked this pull request as ready for review March 11, 2026 08:00

IAlibay (Member, Author) commented Mar 11, 2026

pre-commit.ci autofix

IAlibay (Member, Author) commented Mar 11, 2026

pre-commit.ci autofix

        return reporter

    @staticmethod
    def _get_sampler(

IAlibay (Member, Author) commented Mar 11, 2026

This method is annoyingly close to the RFE one, but also annoyingly just a bit different (because sampler building goes through extra hoops in hybrid topology sims). It might be hard to align them in the future without fixing sampler.create in hybrid topology sims.

IAlibay (Member, Author) commented Mar 12, 2026

pre-commit.ci autofix

IAlibay (Member, Author) commented Mar 12, 2026

pre-commit.ci autofix

Comment on lines +806 to +808
"openmm_version": openmm.__version__,
"openfe_version": openfe.__version__,
"gufe_version": gufe.__version__,
Collaborator

Looking at aligning how the protocols do this: the Htop method does it in the unit's result dict in run rather than in execute

"openmm_version": openmm.__version__,
"openfe_version": openfe.__version__,
"gufe_version": gufe.__version__,
We should probably align them and do it in run?

Member Author

Ah I thought I moved it on htop to also do it in execute!

So my idea here is that execute should do all the "stuff that handles how units interact with each other in a dag", and run should be solely the self-execution of a Unit.

At the end of the day run is just a convenience so that we can execute a unit with whatever data and get the results. It's not really part of the gufe API. So I think it would be great to keep it as agnostic of the DAG it sits in as possible.
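
The split described above can be sketched roughly as follows. This is a toy illustration only, not the gufe or openfe implementation: `ToyUnit`, the signatures of its `run` and `_execute` methods, the `setup_results` input, and the `example_version` key are all hypothetical stand-ins.

```python
from typing import Any


class ToyUnit:
    """Hypothetical unit; not the real gufe ProtocolUnit API."""

    def run(self, n_steps: int) -> dict[str, Any]:
        # Self-contained work: takes plain data and knows nothing
        # about the DAG the unit sits in.
        return {"n_steps_completed": n_steps}

    def _execute(self, ctx: dict[str, Any], **inputs: Any) -> dict[str, Any]:
        # DAG-facing wrapper: unpack what upstream units produced,
        # delegate to run, then attach bookkeeping such as versions.
        outputs = self.run(n_steps=inputs["setup_results"]["n_steps"])
        return {**outputs, "example_version": ctx["versions"]["example"]}


unit = ToyUnit()

# Calling run directly with whatever data we like.
standalone = unit.run(n_steps=10)

# Calling through the DAG-facing wrapper with (made-up) upstream results.
in_dag = unit._execute(
    {"versions": {"example": "0.0.0"}},
    setup_results={"n_steps": 10},
)
```

Under this arrangement the version bookkeeping lives only in the DAG-facing method, so `run` stays usable on its own with arbitrary inputs.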

-----
For now this just checks if the netcdf files are present in the
shared directory but in the future this may expand depending on
how warehouse works.
jthorton (Collaborator) commented Mar 12, 2026

The docstring is missing a Raises section.

Comment on lines +830 to +832
"openmm_version": openmm.__version__,
"openfe_version": openfe.__version__,
"gufe_version": gufe.__version__,
Collaborator

Ah, I should have scrolled down: they are aligned, good job!

Member Author

Oops, I did the same thing you did!

def test_check_restart_one_file_missing(protocol_settings, ahfe_vac_trajectory_path):
    protocol_settings.vacuum_output_settings.checkpoint_storage_filename = "foo.nc"

    errmsg = "One of either the trajectory or checkpoint files are missing"
Collaborator

After seeing this message here I think it might be more helpful to tell users which one is actually missing?
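
One way the more specific message could look is sketched below, together with a numpydoc-style Raises section. This is a minimal sketch under assumptions, not the code in this PR: the function name `check_restart`, its two-path signature, the use of `IOError`, the message wording, and the file names in the usage lines are all hypothetical.

```python
import tempfile
from pathlib import Path


def check_restart(trajectory: Path, checkpoint: Path) -> bool:
    """Report whether a simulation can be resumed from existing files.

    Parameters
    ----------
    trajectory : Path
        Expected location of the trajectory file.
    checkpoint : Path
        Expected location of the checkpoint file.

    Returns
    -------
    bool
        True if both files exist, False if neither does.

    Raises
    ------
    IOError
        If exactly one of the two files exists. The message names
        the file that is missing.
    """
    paths = {"trajectory": trajectory, "checkpoint": checkpoint}
    found = {name: path.is_file() for name, path in paths.items()}
    if all(found.values()):
        return True
    if not any(found.values()):
        return False
    # Exactly one file is absent: say which one, and where it was expected.
    missing = next(name for name, ok in found.items() if not ok)
    raise IOError(f"Cannot resume: the {missing} file is missing: {paths[missing]}")


# Usage with throwaway files (names are made up for the example).
with tempfile.TemporaryDirectory() as tmp:
    traj = Path(tmp) / "simulation.nc"
    chk = Path(tmp) / "checkpoint.nc"
    fresh = check_restart(traj, chk)  # neither file exists yet
    traj.touch()
    try:
        check_restart(traj, chk)  # only the trajectory exists
        message = ""
    except IOError as err:
        message = str(err)
    chk.touch()
    resumable = check_restart(traj, chk)  # both files exist
```

Naming the missing file and its expected path in the message lets a user tell at a glance whether the checkpoint or the trajectory needs to be restored.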

Co-authored-by: Josh Horton <Josh.Horton@newcastle.ac.uk>
github-actions commented

No API break detected ✅

jthorton (Collaborator) left a comment

This is great, I think this is highlighting places where we can deduplicate across protocols in the future as well. Please merge after fixing the missing file error to be more specific.

Development

Successfully merging this pull request may close these issues.

Add resume support for AFE run unit

2 participants