Table Of Contents

Previous topic

Working with AtomGroups

Next topic

Writing coordinates

This Page

Trajectory analysis

The Universe binds together the static topology (which atoms, how are they connected, what un-changing properties do the atoms possess (such as partial charge), ...) and the changing coordinate information, which is stored in the trajectory.

The length of a trajectory (number of frames) is

len(u.trajectory)

The standard way to assess each time step (or frame) in a trajectory is to iterate over the Universe.trajectory attribute (which is an instance of Reader class):

for ts in u.trajectory:
    print("Frame: %5d, Time: %8.3f ps" % (ts.frame, u.trajectory.time))
    print("Rgyr: %g A" % (u.atoms.radiusOfGyration(), ))

The time attribute contains the current time step. The Reader only contains information about one time step: imagine a cursor or pointer moving along the trajectory file. Where the cursor points, there’s you current coordinates, frame number, and time.

Normally you will collect the data in a list or array, e.g.

Rgyr = []
protein = u.selectAtoms("protein")
for ts in u.trajectory:
   Rgyr.append((u.trajectory.time, protein.radiusOfGyration()))
Rgyr = np.array(Rgyr)

Note

It is important to note that the coordinates and related properties calculated from the coordinates such as the radius of gyration change while selections on static properties (such as protein in the example) do not change when moving through a trajectory: You can define the selection once and then recalculate the property of interest for each frame of the trajectory.

However, if selections contain distance-dependent queries (such as around or point, see selection keywords for more details) then one might have to recalculate the selection for each time step and one would put it inside the loop over frames.

The data can be plotted to give the graph below:

# quick plot
import matplotlib.pyplot as plt
ax = plt.subplot(111)
ax.plot(Rgyr[:,0], Rgyr[:,1], 'r--', lw=2, label=r"$R_G$")
ax.set_xlabel("time (ps)")
ax.set_ylabel(r"radius of gyration $R_G$ ($\AA$)")
ax.figure.savefig("Rgyr.pdf")
plt.draw()

The \(R_G(t)\) increases over time, indicating an opening up of the AdK enzyme.

Exercises 4

  1. Take the functions to calculate \(\theta_\text{NMP}\) and \(\theta_\text{LID}\)

    import numpy as np
    from numpy.linalg import norm
    
    def theta_NMP(u):
        """Calculate the NMP-CORE angle for E. coli AdK in degrees"""
        C = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
        B = u.selectAtoms("resid 90:100 and (backbone or name CB)").centerOfGeometry()
        A = u.selectAtoms("resid 35:55 and (backbone or name CB)").centerOfGeometry()
        BA = A - B
        BC = C - B
        theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
        return np.rad2deg(theta)
    
    def theta_LID(u):
        """Calculate the LID-CORE angle for E. coli AdK in degrees"""
        C = u.selectAtoms("resid 179:185 and (backbone or name CB)").centerOfGeometry()
        B = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
        A = u.selectAtoms("resid 125:153 and (backbone or name CB)").centerOfGeometry()
        BA = A - B
        BC = C - B
        theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
        return np.rad2deg(theta)
    

    and calculate the time series \(\theta_\text{NMP}(t)\) and \(\theta_\text{LID}(t)\).

    Plot them together in one plot.

  2. Plot \(\theta_\text{NMP}(t)\) against \(\theta_\text{LID}(t)\).

    What does the plot show?

    Why could such a plot be useful?

The code to generate the figure contains theta_LID() and theta_NMP().

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
import numpy as np
from numpy.linalg import norm

def theta_NMP(u):
    """Calculate the NMP-CORE angle for E. coli AdK in degrees"""
    C = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
    B = u.selectAtoms("resid 90:100 and (backbone or name CB)").centerOfGeometry()
    A = u.selectAtoms("resid 35:55 and (backbone or name CB)").centerOfGeometry()
    BA = A - B
    BC = C - B
    theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
    return np.rad2deg(theta)

def theta_LID(u):
    """Calculate the LID-CORE angle for E. coli AdK in degrees"""
    C = u.selectAtoms("resid 179:185 and (backbone or name CB)").centerOfGeometry()
    B = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
    A = u.selectAtoms("resid 125:153 and (backbone or name CB)").centerOfGeometry()
    BA = A - B
    BC = C - B
    theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
    return np.rad2deg(theta)

if __name__ == "__main__":
    import MDAnalysis
    from MDAnalysis.tests.datafiles import PSF, DCD
    import matplotlib
    import matplotlib.pyplot as plt

    u = MDAnalysis.Universe(PSF, DCD)
    data = np.array([(u.trajectory.time, theta_NMP(u), theta_LID(u)) for ts in u.trajectory])
    time, NMP, LID = data.T


    # plotting
    degreeFormatter = matplotlib.ticker.FormatStrFormatter(r"%g$^\circ$")
    fig = plt.figure(figsize=(6,3))

    ax1 = fig.add_subplot(121)
    ax1.plot(time, NMP, 'b-', lw=2, label=r"$\theta_{\mathrm{NMP}}$")
    ax1.plot(time, LID, 'r-', lw=2, label=r"$\theta_{\mathrm{LID}}$")
    ax1.set_xlabel(r"time $t$ (ps)")
    ax1.set_ylabel(r"angle $\theta$")
    ax1.yaxis.set_major_formatter(degreeFormatter)
    ax1.legend(loc="best")

    ax2 = fig.add_subplot(122)
    ax2.plot(NMP, LID, 'k-', lw=3)
    ax2.set_xlabel(r"NMP-CORE angle $\theta_{\mathrm{NMP}}$")
    ax2.set_ylabel(r"LID-CORE angle $\theta_{\mathrm{LID}}$")
    ax2.xaxis.set_major_formatter(degreeFormatter)
    ax2.yaxis.set_major_formatter(degreeFormatter)
    ax2.yaxis.tick_right()
    ax2.yaxis.set_label_position("right")

    fig.subplots_adjust(left=0.12, right=0.88, bottom=0.2, wspace=0.15)

    for ext in ('svg', 'pdf', 'png'):
        fig.savefig("NMP_LID_angle_projection.{0}".format(ext))

Note that one would normally write the code more efficiently and generate the atom groups once and then pass them to a simple function to calculate the angle

def theta(A, B, C):
    """Calculate the angle between BA and BC for AtomGroups A, B, C"""
    B_center = B.centroid()
    BA = A.centroid() - B_center
    BC = C.centroid() - B_center
    theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
    return np.rad2deg(theta)

Bells and whistles

Quick data acquisition

Especially useful for interactive analysis in ipython –pylab using list comprehensions (implicit for loops):

protein = u.selectAtoms("protein")
data = np.array([(u.trajectory.time, protein.radiusOfGyration()) for ts in u.trajectory])
time, RG = data.T
plot(time, RG)

More on the trajectory iterator

One can directly jump to a frame by using “indexing syntax”:

>>> u.trajectory[50]
< Timestep 51 with unit cell dimensions array([  0.,   0.,   0.,  90.,  90.,  90.], dtype=float32) >
>>> ts.frame
51

You can also slice trajectories, e.g. if you want to start at the 10th frame and go to 10th before the end, and only use every 5th frame:

for ts in u.trajectory[9:-10:5]:
    print(ts.frame)
    ...

Note

Trajectory indexing and slicing uses 0-based indices (as in standard Python) but MDAnalysis numbers frames starting with 1 (for historical reasons and according to the practice of all MD codes).

Note

Not all trajectory readers support direct access and arbitrary slices, although many commonly ones such as DCD, XTC/TRR, and Amber NETCDF do.

See also

One can iterate through multiple trajectories in parallel with the help of itertools.izip() from the itertools module, which also provide other interesting ways to work with trajectory iterators.