Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -55,8 +55,10 @@ Enhancements
* added functionality to write files in compressed form(gz,bz2). (Issue #2216,
PR #2221)
* survival probability additions: residues, intermittency, step with performance,
(PR #2226)

(PR #2226)
* add timeseries() method to all readers to extract all coordinates into
a numpy array (see PR #1400)

Changes
* added official support for Python 3.7 (PR #1963)
* stopped official support of Python 3.4 (#2066, PR #2174)
Expand Down
64 changes: 64 additions & 0 deletions package/MDAnalysis/coordinates/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -1692,6 +1692,70 @@ def __repr__(self):
natoms=self.n_atoms
))

def timeseries(self, asel=None, start=None, stop=None, step=None,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool, but maybe it should be called coordinates_timeseries? unless it's obvious that the only timeseries of interest is the coordinates? I also think maybe the function should be made more general to be a timeseries of all time data, so box info, velocities etc? This would make the Reader -> MemoryReader transformation trivial. So returning something like a numpy structured array? Not sure what a good container for this is though...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, it's not general enough at the moment. However, it is consistent with the MemoryReader

def timeseries(self, asel=None, start=0, stop=-1, step=1, order='afc', format=None):
"""Return a subset of coordinate data for an AtomGroup in desired
column order/format. If no selection is given, it will return a view of
the underlying array, while a copy is returned otherwise.

order='fac'):
"""Return a subset of coordinate data for an AtomGroup

Parameters
----------
asel : AtomGroup (optional)
The :class:`~MDAnalysis.core.groups.AtomGroup` to read the
coordinates from. Defaults to ``None``, in which case the full set of
coordinate data is returned.
start : int (optional)
Begin reading the trajectory at frame index `start` (where 0 is the index
of the first frame in the trajectory); the default ``None`` starts
at the beginning.
stop : int (optional)
End reading the trajectory at frame index `stop`-1, i.e, `stop` is excluded.
The trajectory is read to the end with the default ``None``.
step : int (optional)
Step size for reading; the default ``None`` is equivalent to 1 and means to
read every frame.
order : str (optional)
the order/shape of the return data array, corresponding
to (a)tom, (f)rame, (c)oordinates all six combinations
of 'a', 'f', 'c' are allowed ie "fac" - return array
where the shape is (frame, number of atoms,
coordinates)

.. note:: Only `"fac"` implemented.


See Also
--------
MDAnalysis.coordinates.memory


.. versionadded:: 0.20.0
"""
start, stop, step = self.check_slice_indices(start, stop, step)
nframes = len(range(start, stop, step))

if asel is not None:
if len(asel) == 0:
raise NoDataError(
"Timeseries requires at least one atom to analyze")
atom_numbers = asel.indices
natoms = len(atom_numbers)
else:
atom_numbers = None
natoms = self.n_atoms

if not order in ('fac', None):
# need to add swapping around axes etc
# see MemoryReader.timeseries() and lib.formats.libdcd.DCDfile.readframe()
raise NotImplementedError

# allocate output array
coordinates = np.empty((nframes, natoms, 3), dtype=np.float32)
for i, ts in enumerate(self[start:stop:step]):
# if atom_number == None, this will cause view of array
# do we need copy in this case?
coordinates[i] = ts.positions[atom_numbers].copy()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be double copying in the case of a non-memoryview. I think maybe either np.array or np.asarray lets you specify copy=True which will end up with one copy max.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right.

But note that you gave the MemoryReader its own timeseries method

def timeseries(self, asel=None, start=0, stop=-1, step=1, order='afc', format=None):

Would be nice if we could do everything consistently in base.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, I think I misunderstood your comment.... you were not referring to MemoryReader at all, were you?

@orbeckst orbeckst Jun 11, 2019

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.asanyarray() does not have copy=True. np.array(a, copy=True) is equivalent to np.copy(a).

Do you want to do something like

x = ts.positions[atom_numbers]
if is_copy(x):
   coordinates[i] = x
else:
   coordinates[i] = x.copy()

(and where I don't know how to do is_copy(x))?


return coordinates

def add_auxiliary(self, auxname, auxdata, format=None, **kwargs):
"""Add auxiliary data to be read alongside trajectory.

Expand Down
39 changes: 24 additions & 15 deletions package/MDAnalysis/coordinates/memory.py
Original file line number Diff line number Diff line change
Expand Up @@ -90,45 +90,55 @@
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The :class:`MemoryReader` provides great flexibility because it
becomes possible to create a :class:`~MDAnalysis.core.universe.Universe` directly
from a numpy array.
becomes possible to create a
:class:`~MDAnalysis.core.universe.Universe` directly from a numpy
array. Using the :class:`MemoryReader` is signified by providing a
coordinate array instead of a trajectory file and setting the
``format`` keyword to "MEMORY" (or to the :class:`MemoryReader` class
itself): either ::

u = Universe(..., array, format="MEMORY")

or ::

from MDAnalysis.coordinates.memory import MemoryReader
u = Universe(..., array, format=MemoryReader)

will work.

A simple example consists of a new universe created from the array
extracted from a DCD
:meth:`~MDAnalysis.coordinates.DCD.DCDReader.timeseries`::
extracted from a :meth:`~MDAnalysis.coordinates.base.ProtoReader.timeseries`::

import MDAnalysis as mda
from MDAnalysisTests.datafiles import DCD, PSF
from MDAnalysis.coordinates.memory import MemoryReader

universe = mda.Universe(PSF, DCD)

coordinates = universe.trajectory.timeseries(universe.atoms)
universe2 = mda.Universe(PSF, coordinates, format=MemoryReader, order='afc')
universe2 = mda.Universe(PSF, coordinates, format="MEMORY", order='afc')


.. _create-in-memory-trajectory-with-AnalysisFromFunction:

.. rubric:: Creating an in-memory trajectory with
:func:`~MDAnalysis.analysis.base.AnalysisFromFunction`

The :meth:`~MDAnalysis.coordinates.DCD.DCDReader.timeseries` is
currently only implemented for the
:class:`~MDAnalysis.coordinates.DCD.DCDReader`. However, the
The :meth:`~MDAnalysis.coordinates.base.ProtoReader.timeseries` is
available for all formats. However, as an alternative the
:func:`MDAnalysis.analysis.base.AnalysisFromFunction` can provide the
same functionality for any supported trajectory format::
same functionality and the example shows how one can manually extract
coordinates and load them into a new universe::

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PDB, XTC

from MDAnalysis.coordinates.memory import MemoryReader
from MDAnalysis.analysis.base import AnalysisFromFunction

u = mda.Universe(PDB, XTC)

coordinates = AnalysisFromFunction(lambda ag: ag.positions.copy(),
u.atoms).run().results
u2 = mda.Universe(PDB, coordinates, format=MemoryReader)
u2 = mda.Universe(PDB, coordinates, format="MEMORY")

.. _creating-in-memory-trajectory-label:

Expand All @@ -147,7 +157,6 @@
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PDB, XTC

from MDAnalysis.coordinates.memory import MemoryReader
from MDAnalysis.analysis.base import AnalysisFromFunction

u = mda.Universe(PDB, XTC)
Expand All @@ -156,7 +165,7 @@
coordinates = AnalysisFromFunction(lambda ag: ag.positions.copy(),
protein).run().results
u2 = mda.Merge(protein) # create the protein-only Universe
u2.load_new(coordinates, format=MemoryReader)
u2.load_new(coordinates, format="MEMORY")

The protein coordinates are extracted into ``coordinates`` and then
the in-memory trajectory is loaded from these coordinates. In
Expand All @@ -165,7 +174,7 @@
u2 = mda.Merge(protein).load_new(
AnalysisFromFunction(lambda ag: ag.positions.copy(),
protein).run().results,
format=MemoryReader)
format="MEMORY")

The new :class:`~MDAnalysis.core.universe.Universe` ``u2`` can be used
to, for instance, write out a new trajectory or perform fast analysis
Expand Down