Skip to content

Move the water bridge analysis to the new analysis class#2087

Merged
orbeckst merged 30 commits into
MDAnalysis:developfrom
xiki-tempula:new_wba
Aug 13, 2019
Merged

Move the water bridge analysis to the new analysis class#2087
orbeckst merged 30 commits into
MDAnalysis:developfrom
xiki-tempula:new_wba

Conversation

@xiki-tempula

@xiki-tempula xiki-tempula commented Oct 3, 2018

Copy link
Copy Markdown
Contributor

Move the water bridge analysis to the new analysis class, add high order water bridge support.
Rewritte the count_by_type and count_by_time function, add custom anaylsis function support.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Move the water bridge analysis to the new analysis class, add high
    Order water bridge support. (PR #2087)
@xiki-tempula

Copy link
Copy Markdown
Contributor Author

It seems that the line result = [(*key, result_dict[key]*1.0/length) for key in result_dict] is falling in the older versions of python. I wonder if there are any hacks to go around this problem except writing another loop?

@kain88-de

Copy link
Copy Markdown
Member

yes this is not valid syntax for python 3.4 and older. What did you want to replace?

@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@kain88-de
My current thought would be replaced with

result = [[i for i in key] for key in result_dict]
[result[i].append(result_dict[key]*1.0/length) for i, key in enumerate(result_dict)]
return result

@kain88-de

kain88-de commented Oct 3, 2018 via email

Copy link
Copy Markdown
Member

@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@kain88-de It is a tuple has the form
(from_index, to_index, (from_resname, from_resid, from_name), (to_resname, to_resid, to_name)).

@kain88-de

kain88-de commented Oct 3, 2018 via email

Copy link
Copy Markdown
Member

@xiki-tempula

xiki-tempula commented Oct 3, 2018

Copy link
Copy Markdown
Contributor Author

@kain88-de Sorry, I haven't made this clear. This is the default output version. In this update, the user has the freedom to define the key to this dictionary.
The current output is formatted so that it is compatible with the previous version. Since the output is not used by other module in the mda. Perhaps, I could just do [[k, v/length] for k, v in six.iteritems(result_dict)].

Comment thread package/MDAnalysis/analysis/hbonds/wbridge_analysis.py
Comment thread package/MDAnalysis/analysis/hbonds/wbridge_analysis.py
@codecov

codecov Bot commented Oct 5, 2018

Copy link
Copy Markdown

Codecov Report

Merging #2087 into develop will increase coverage by 0.07%.
The diff coverage is 91.05%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2087      +/-   ##
===========================================
+ Coverage    89.75%   89.82%   +0.07%     
===========================================
  Files          173      173              
  Lines        21536    21711     +175     
  Branches      2804     2846      +42     
===========================================
+ Hits         19329    19502     +173     
  Misses        1615     1615              
- Partials       592      594       +2
Impacted Files Coverage Δ
...age/MDAnalysis/analysis/hbonds/wbridge_analysis.py 91.54% <91.05%> (+5.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7854011...ebf2fb2. Read the comment docs.

@orbeckst

Copy link
Copy Markdown
Member

The doc test had stalled. I restarted it. Then we should also see coverage.

Please also fix the conflicts with develop.

@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@orbeckst Thank you for the advice. I have fixed the conflict but I think the py36 develop build might have some problem which makes the test fails.

@orbeckst

Copy link
Copy Markdown
Member

I restarted the 3.6 jobs, maybe that helps?

@orbeckst

Copy link
Copy Markdown
Member

If not, drop an email to the dev list; I am not sure what's wrong.

@orbeckst

Copy link
Copy Markdown
Member

I just merged @richardjgowers PR #2145 so rebase against develop (or merge develop) and see if this fixes things.

@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@orbeckst Thank you for the information. Would you mind restart the Travis build, please? I guess there we probably don't need another commit.

@orbeckst

Copy link
Copy Markdown
Member

Did you merge develop? This is needed for the change to take place. This will automatically run Travis.

@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@orbeckst Thank you for the advice. I have merged the develop to this PR and it seems that MDAnalysisTests.analysis.test_encore.TestEncore is failing.

@orbeckst

orbeckst commented Nov 22, 2018 via email

Copy link
Copy Markdown
Member

@orbeckst

Copy link
Copy Markdown
Member

I restarted the one job that failed.

@orbeckst orbeckst left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a big PR. It mixes new features (higher order wbridges) with refactoring (AnalysisBase). It would have been easier if these two things had been addressed separately. In particular, new features automatically imply that this has to go into a 0.20.x (because we use semantic versioning.

Many of my comments are regarding documentation, mainly to make clearer what happens in the module.

I am also concerned with code duplication between the new wbridge analysis and the hbond analysis. I am not 100% sure what the best course forward is but I'd like to hear suggestions. Perhaps create a mixin-class and raise an issue for refactoring hbond analysis; not everything has to be accomplished in one PR (many smaller PRs with weel-defined goals are much better).

In the tests I'd also suggest to reduce code-duplication by using more fixtures as this also leads to cleaner separation of data and test logic.

ADDENDUM

  • remove file package/.DS_Store from the PR
  • rebase against develop (or merge develop)

Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py
Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py
raise pytest.fail("selection_type aaa should rasie error")

def test_empty_selection(self):
grofile = '''Test gro file

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it a fixture

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


def test_loop(self):
'''Test if loop can be handled correctly'''
grofile = '''Test gro file

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it a fixture

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py
Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py Outdated
Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py Outdated
Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py Outdated
return Atomtypes(np.array([guess_atom_type(name) for name in names], dtype=object))


class TestHydrogenBondAnalysis(object):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be in a different test?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is the name wrong?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orbeckst I'm thinking of unifying the output of HydrogenBondAnalysis and WaterBridgeAnalysis. So I have transported the HydrogenBondAnalysis to here to make sure WaterBridgeAnalysis can pass the tests which were written for HydrogenBondAnalysis.

Comment thread testsuite/MDAnalysisTests/analysis/test_wbridge.py Outdated
@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@orbeckst Sorry for bothering you in this busy season of GSoC, I have submitted a commit which passes all the test. Would you mind give a review, please? Thank you. Once this PR is merged, we can probably be thinking of how to use capped_distance to improve the performance.

@orbeckst

Copy link
Copy Markdown
Member

@xiki-tempula , can you please go through the comments above and for each comment reply how you addressed it or if you decided to not address it an why? Think of it as addressing revisions in a paper. Don't click the "Resolved" button on the comments. I will do this after having read your replies.

Given the decisions regarding refactoring H-bond analysis in #2238 , you don't need to worry about code duplication between HydrogenBondAnalysis and WaterBridgeAnalysis for the moment. This might come up later with the new module.

Reading your summary of changes will make it a lot easier for me to go through things. The PR is so big that I don't have the time to re-read all the lines of code and compare to what I said in December. (As a general rule: the smaller the PR the better the chance that someone can review it in a timely manner.)

Ping me when you're done and I will review your comments and code where necessary.

Thanks!

@xiki-tempula

xiki-tempula commented Aug 5, 2019

Copy link
Copy Markdown
Contributor Author

@orbeckst Thank you for the comments. I have added my comments to the comments. Most of the issue is with the use of the fixture. Now, all the universe used in the test section is made into a fixture and is listed on the top of the test section.

There are some lines in the test which exceeds 80 characters. I can fix that but I prefer to do that when things are finalised.

@orbeckst orbeckst left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just address these minor issues (see comments):

  • remove .DS_Store files – add .DS_Store to .gitignore
  • small documentation issues (see comments)
  • very minor code style issue

# hbonds linking the selection 1 and selection 2 to the bridging
# water 1
[ # hbond 1 from selection 1 to the bridging water 1
<donor index (0-based)>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • add the versionchanged to the class
  • add an entry to CHANGELOG's Changes stating that the output format of WaterBridgeAnalysis changed


.. _pandas.DataFrame: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
Note that the result is arranged in the format of (key, proportion of time). When no
custom analysis function is supplied, the key is expended for backward compatibility.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • fix expended -> expanded (?)
  • state what it is backward compatible to

key = (s1_resname, s1_resid, s2_resname, s2_resid)
output[key] += 1

w.count_by_type(analysis_func=analysis)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention the "in-place" modification requirement for the analysis() function.

Comment thread package/MDAnalysis/analysis/hbonds/wbridge_analysis.py
Comment thread package/MDAnalysis/analysis/hbonds/wbridge_analysis.py
As the _timeseries to timeserie conversion will be deprecated in 1.0
this function will automatically lose its value.
"""
def _expand_timeseries(self, entry, output_format=None):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "old" mean? Make this precise with version information, e.g. "format for release up to 0.19.2"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanation has been added.

# atom1 is hydrogen bond donor, position not swapped.
atom1, atom2 = atom1, atom2
else:
raise KeyError('Only \'sele1_sele2\' or \'donor_acceptor\' are allowed as output format.')

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use double quotation marks to get rid of ugly backslash escapes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double quotations added.

See Also
--------
:attr:`table` : structured array of the data
.. versionchanged:: 0.20.0

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say what changed

.. versionchanged:: 0.20.0
   The output format was changed bla bla...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change has been mentioned

@orbeckst

orbeckst commented Aug 9, 2019

Copy link
Copy Markdown
Member

I merged develop into it and fixed the merge conflict in CHANGELOG; note that your GitHub handle @xiki-tempula was already in the authors list for 0.20.0.

@xiki-tempula

Copy link
Copy Markdown
Contributor Author

@orbeckst I have made the corrections. All the lines in the test file which are too long are being altered. In the main file, though some of the lines are a bit too long, I think having them in the same line might make it easier to read.

@orbeckst orbeckst left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a lot of work but finally: LGTM!

Thanks for being patient and addressing all issues.

@orbeckst orbeckst merged commit e910a12 into MDAnalysis:develop Aug 13, 2019
@xiki-tempula xiki-tempula deleted the new_wba branch August 13, 2019 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants