.. -*- encoding: utf-8 -*- ======== Basics ======== We discuss the fundamental objects in MDAnalysis, the :ref:`universe-and-atomgroup`, and the facilities for :ref:`selections` of atoms. These selections themselves are again an :class:`~MDAnalysis.core.AtomGroup.AtomGroup`. .. _universe-and-atomgroup: Universe and AtomGroup ====================== MDAnalysis is **object oriented**. Molecular systems consist of :class:`~MDAnalysis.core.AtomGroup.Atom` objects (instances of the class :class:`MDAnalysis.core.AtomGroup.Atom`), which are grouped in :class:`~MDAnalysis.core.AtomGroup.AtomGroup` instances. You build the :class:`~MDAnalysis.core.AtomGroup.AtomGroup` of your system by loading a **topology** (list of atoms and possibly their connectivity) together with a **trajectory** (coordinate information) into the central data structure, the :class:`~MDAnalysis.core.AtomGroup.Universe` object:: >>> u = MDAnalysis.Universe(PSF, DCD) >>> print(u) The atoms are stored in the attribute :attr:`~MDAnalysis.core.AtomGroup.Universe.atoms` of the :class:`MDAnalysis.core.AtomGroup.Universe`:: >>> print(u.atoms) >>> list(u.atoms[:5]) [< Atom 1: name 'N' of type '56' of resname 'MET', resid 1 and segid '4AKE'>, < Atom 2: name 'HT1' of type '2' of resname 'MET', resid 1 and segid '4AKE'>, < Atom 3: name 'HT2' of type '2' of resname 'MET', resid 1 and segid '4AKE'>, < Atom 4: name 'HT3' of type '2' of resname 'MET', resid 1 and segid '4AKE'>, < Atom 5: name 'CA' of type '22' of resname 'MET', resid 1 and segid '4AKE'>] Any :class:`~MDAnalysis.core.AtomGroup.AtomGroup` knows the residues that the atoms belong to via the attribute :attr:`~MDAnalysis.core.AtomGroup.AtomGroup.residues`, which produces a :class:`~MDAnalysis.core.AtomGroup.ResidueGroup`. A :class:`~MDAnalysis.core.AtomGroup.ResidueGroup` acts like a list of :class:`~MDAnalysis.core.AtomGroup.Residue` objects:: >>> u.atoms[100:130].residues , , ]> Larger organizational units are :class:`~MDAnalysis.core.AtomGroup.Segment` instances, for example one protein or all the solvent molecules or simply the whole system. :class:`~MDAnalysis.core.AtomGroup.Atom`, :class:`~MDAnalysis.core.AtomGroup.AtomGroup`, :class:`~MDAnalysis.core.AtomGroup.Residue`, and :class:`~MDAnalysis.core.AtomGroup.ResidueGroup` have an attribute :attr:`~MDAnalysis.core.AtomGroup.AtomGroup.segments` that will list the segment IDs ("segids") as a :class:`~MDAnalysis.core.AtomGroup.SegmentGroup`:: >>> u.atoms.segments ]> The converse is also true: each "higher" level in the hierarchy also know about the :class:`~MDAnalysis.core.AtomGroup.Residue` and :class:`~MDAnalysis.core.AtomGroup.Atom` instances it contains. For example, to list the atoms of the :class:`~MDAnalysis.core.AtomGroup.ResidueGroup` we had before:: >>> r = u.atoms[100:130].residues >>> r.atoms Exercise 1 ---------- 1. What residue ("resname") does the last atom belong to in the above example? :: >>> r = u.atoms[100:130].residues >>> r.atoms[-1] < Atom 136: name 'O' of type '70' of resname 'ALA', resid 8 and segid '4AKE'> 2. Why does the expression :: len(u.atoms[100:130]) == len(u.atoms[100:130].residues.atoms) return ``False``? Because the complete residues contain more atoms than the arbitrary slice of atoms. 3. How many residues are in the :class:`~MDAnalysis.core.AtomGroup.AtomGroup.Universe` ``u``? :: >>> len(u.atoms.residues) >>> u.atoms.numberOfResidues() 214 How do you get a list of the residue names (such as ``["Ala", "Gly", "Gly", "Asp", ...]``) and residue numbers ("resid") for atoms 1000 to 1300? And as a list of tuples ``(resname, resid)`` (Hint: :func:`zip`)?:: >>> resnames = u.atoms[999:1300].resnames() >>> resids = u.atoms[999:1300].resids() >>> zip(resnames, resids) How do you obtain the resid and the resname for the 100th residue? (Hint: investigate the :class:`~MDAnalysis.core.AtomGroup.Residue` object interactively with :kbd:`TAB` completion) :: >>> r100 = u.atoms.residues[99] >>> print(r100.id, r100.name) 100 GLY 4. How many segments are there? :: >>> len(u.segments) >>> len(u.atoms.segments) >>> u.atoms.numberOfSegments() 1 What is the segment identifier of the first :class:`~MDAnalysis.core.AtomGroup.Segment`? :: >>> s1 = u.segments[0] >>> s1.id '4AKE' .. SeeAlso:: Methods of :class:`~MDAnalysis.core.AtomGroup.AtomGroup`, :class:`~MDAnalysis.core.AtomGroup.ResidueGroup`, and :class:`~MDAnalysis.core.AtomGroup.SegmentGroup` * :meth:`~MDAnalysis.core.AtomGroup.AtomGroup.numberOfResidues` and :meth:`~MDAnalysis.core.AtomGroup.AtomGroup.numberOfAtoms` * :meth:`~MDAnalysis.core.AtomGroup.AtomGroup.resids` * :meth:`~MDAnalysis.core.AtomGroup.AtomGroup.resnames` .. _selections: Selections ========== .. TODO: named selections MDAnalysis comes with a fairly complete `atom selection`_ facility. Primarily, one uses the method :meth:`~MDAnalysis.core.AtomGroup.Universe.selectAtoms` of a :class:`~MDAnalysis.core.AtomGroup.Universe`:: >>> CA = u.selectAtoms("protein and name CA") >>> CA >>> but really any :class:`~MDAnalysis.core.AtomGroup.AtomGroup` has a :meth:`~MDAnalysis.core.AtomGroup.AtomGroup.selectAtoms` method:: >>> acidic = CA.selectAtoms("resname ASP or resname GLU") >>> acidic >>> >>> acidic.residues , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ]> .. SeeAlso:: All the `selection keywords`_ are described in the documentation. Selections can be combined with boolean expression and it is also possible to select by geometric criteria, e.g. with the :samp:`around {distance} {selection}` keyword:: u.selectAtoms("((resname ASP or resname GLU) and not (backbone or name CB or name CG)) \ and around 4.0 ((resname LYS or resname ARG) \ and not (backbone or name CB or name CG))").residues This selection will find atoms potentially involved in salt bridges between acidic and basic residues. Exercises 2 ----------- 1. Select the range of resids 100 to 200 ("100-200") with a selection. Compare the result to what you get by slicing the :attr:`u.atoms.residues` appropriately. Which approach would you prefer to use in a analysis script? Solution:: >>> u.selectAtoms("resid 100-200") Compare to the slicing solution (doing an element-wise comparison, i.e. residue by residue in each :func:`list`):: >>> list(u.selectAtoms("resid 100-200").residues) == list(u.atoms.residues[99:200]) If one wants to get specific residues in scripts one typically uses selections instead of slicing because the index in the slice might not correspond to the actual residue ids (minus 1): If a number of residues (e.g. 150-160) are missing from the structure then the selection will simply give you residues 100-149 and 151-200 but the slice 99:200 would give you residues 100-149 and *161-209*. 2. Select all residues that do not contain a :math:`\mathrm{C}_\beta` ("CB") atom. How many are there? What residue names did you find? Solution:: >>> sel = u.selectAtoms("(byres name CA) and not (byres name CB)").residues >>> len(sel) 20 These are all Glycines, as can be seen by comparing the residue groups element-wise:: >>> glycines = u.selectAtoms("resname GLY") >>> list(sel) == list(glycines.residues) True .. _atom selection: http://pythonhosted.org/MDAnalysis/documentation_pages/selections.html .. _selection keywords: http://pythonhosted.org/MDAnalysis/documentation_pages/selections.html#selection-keywords