New in version 0.2.5.
Basic reading, manipulating and writing works but can still be a bit awkward. See documentation for ITP.
Warning
The code is still experimental.
Limitations
An itp file can be loaded and individual sections accessed (print shows a representation of what it would look like written to a file):
>>> itp = gromacs.fileformats.ITP("benzylhyd5.itp")
>>> print itp.header
>>> ...
>>> print itp.header.moleculetype
[ moleculetype ]
; Name nrexcl
5FH 3
Each section is an object that contains the parsed data in a data attribute:
>>> print itp.header.moleculetype.data
odict.odict([('name', '5FH'), ('nrexcl', 3)])
>>> print itp.header.moleculetype.data['name']
5FH
The moleculetypes section contains the important subsections that make up the topology:
>>> print itp.header.moleculetype.sections.keys()
['atoms', 'bonds', 'angles', 'dihedrals', 'pairs']
They can be accessed in the same manner. For instance, counting all atom records:
>>> print len(itp.header.moleculetype.atoms)
24
For the atoms, the data structure is a numpy.rec.recarray.
Data can be modified and then finally a new itp file can be written with the ITP.write() method.
>>> itp.write("modified.itp")
The ITP class always represents one itp file. It contains a tree of parsed subsections that correspond to the subsections in the itp file. The way to drill down is to access the attribute ITP.sections, where the subsections are stored. ITP.sections acts like a dictionary. Each subsection contains again a ITPsection.sections dictionary containing its own subsections. To find the [ atoms ] data:
itp.sections['header'].sections['moleculetype'].sections['atoms'].data
For convenience, one can also use the keys in the sections dictionary directly as attributes. In this way finding the [ atoms ] data is even simpler:
itp.header.moleculetype.atoms.data
For more details on the organization of the data structure see Classes corresponding to ITP sections.
Class that represents a Gromacs topology itp file.
The file is processed by a Preprocessor according to directves in the file and variables defined in the constructor keyword arguments. Thus, the class ITP represents one particularly state of the itp file with any preprocessor constructs excluded. Only a limited set of directives is understood and a SyntaxError is raised in cases where the preprocessor does not know how to handle the file.
The file is parsed into a hierarchy of sections (and subsections), which are accessible in the attribute ITP.sections.
Data of (sub)sections is stored in data, e.g. the [atoms] section
itp.sections['header'].sections['moleculetype'].sections['atoms'].data
contains the list of atoms for molecule
itp.sections['header'].sections['moleculetype'].data['name']
Data attributes are typically (ordered) dictionaries or NumPy recarrays.
The top-level section is called “header” and simply contains all comment lines before the first real parsed section.
Warning
Not all section types are implemented. There can only be a single type of section at each level of the hierarchy.
Only a single [moleculetype] is currently supported.
Subsequent [dihedrals] sections are merged but if another section intersperses two [dihedral] sections then the first one is lost.
Comments are stripped except at the beginning of the file and at the end of data lines.
New in version 0.2.5.
Changed in version 0.3.1: Access to sections by attribute access is possible.
Initialize ITP structure.
Arguments : |
|
---|
See also
preprocessor describes details about the implementation of preprocessing of the itp file, including the (limited) syntax understood by the preprocesser.
Check if file makes use of any preprocessor constructs.
The test is done by running the file through the Preprocessor (while stripping all empty and lines starting with a comment character. This is compared to the original file, stripped in the same manner. If the two stripped files differ from each other then the preprocessor altered the file and preprocessor directives must have been involved and this function returns True.
Simple recursive descent parsing of ITP.
stream must have a next() and an unread() method. A call must supply the next line of input or raise EOFError or StopIteration.
Preprocess, read and parse itp file filename.
Any keywords in defines are use to modify the default preprocessor variables (see gromacs.fileformats.preprocessor.Preprocessor.parse() for details). Setting preprocess = False skips the preprocessing step.
Generator that applies func to each section.
Traverses all sections depth first. Applies
User supplied function that uses both the string name and the ITPsection instance and returns a result. Note that any lists and lists of lists will be flattened so it mostly makes sense to return a string or similar.
at each level and iterates over the individual results. With flat = True the resulting list is returned flat (default), otherwise the list represents the tree structure of the itp file.
By default, func is str(section). Thus, the whole itp file can be printed out with
print "\n".join(itp.walk_sections())
(See write() for writing out the ITP.)
Write ITP file to filename
Class ITP makes use of the following classes to parse and represent a topology as stored in an itp file.
Class to parse and store ITP data.
data contains the data of the section
sections holds the sub-parsers (including their own data; this odict mirrors the parse tree.
Note that sections at the same level are merged.
parsers contains a dict of the all the parser classes (keyed by section name) that are included in this section; typically overriden.
New in version 0.2.5.
Changed in version 0.3.1: Access to sections by attribute access is possible.
Generator that applies func to each section and subsections.
Traverses all sections depth first. Applies
User supplied function that uses both the string name and the ITPsection instance and returns a result. Note that any lists and lists of lists will be flattened so it mostly makes sense to return a string or similar.
at each level and iterates over the individual results. With flat = True the resulting list is returned flat (default), otherwise the list represents the tree structure of the itp file.
By default, func is str(section).
Class to represent most ITP sections.
New in version 0.2.5.
atom data as a numpy.rec.array
The data inside the array can be changed but not appended.
Return a string of the section data in ITP format.
Data entries containing None are stripped from the output. In accordance with Gromacs ITP parsing rules, data columns can only be ommitted from the right.
data is atom data, stored as numpy.rec.arry
Wrapper around anything with a next() method, to provide an unread().
Provide the next method (or any other function that returns one line of input) as the argument getline.
Implements a one-line buffer, unread() can only move the stream one line back (repeatedly calling unread() does not do anything).
New in version 0.2.5.
Generator for flattening a nested list l.
http://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists-in-python
New in version 0.2.5.
The ITP file is parsed with a simple recursive descending depth-first algorithm. Classes contain a dictionary of sub-section parsers.
The section hierarchy is
Moleculetype (Note: currently only a single moleculetype allowed)
The customary comments section before the first section.
Example format:
; itp file for Benzene
; example "header"
New in version 0.2.5.
ITP [atomtypes] section.
Example format:
[ atomtypes ]
;name bond_type mass charge ptype sigma epsilon
c3 c3 0.0000 0.0000 A 3.39967e-01 4.57730e-01
hc hc 0.0000 0.0000 A 2.64953e-01 6.56888e-02
New in version 0.2.5.
ITP [ moleculetype ] section
Example format:
[ moleculetype ]
; Name nrexcl
5FH 3
New in version 0.2.5.
Set the molecule name to molname.
ITP [ atoms ] section
Example format:
[ atoms ]
; atomnr atomtype resnr resname atomname chargegrp charge mass
1 opls_145 1 5FH C7 7 -0.115 12.01100 ; CA # Benzene C - 12 site JACS,112,4768-90. Use #145B for biphenyl
2 opls_146 1 5FH H71 7 0.115 1.00800 ; HA # Benzene H - 12 site.
The data itself are stored in data, a numpy.rec.array. data is a managed attribute but the values inside can be changed. Fields in the array have a maximum size (in particular, comments are cut off after 128 characters).
New in version 0.2.5.
numpy.dtype columns for the data
output format (space separated), same ordering as columns
Changes the resname to name for all atom entries
ITP [ bonds ] section
Example format:
[ bonds ]
; ai aj funct r k
1 2 1 ; CA-HA # PHE, etc.
1 3 1 ; CA-CA # TRP,TYR,PHE
New in version 0.2.5.
numpy.dtype columns for the data
output format (space separated), same ordering as columns
ITP [ angles ] section
Example format:
[ angles ]
; ai aj ak funct theta cth
2 1 3 1 ; HA-CA-CA #
2 1 11 1 ; HA-CA-CA #
3 1 11 1 ; CA-CA-CA # PHE(OL)
New in version 0.2.5.
numpy.dtype columns for the data
output format (space separated), same ordering as columns
ITP [ dihedrals ] section
Example format:
[ dihedrals ]
; ai aj ak al funct C0 ... C5
2 1 3 4 3 ; HA-CA-CA-HA # aromatic ring (X-CA-CA-X generic proper dihedral)
2 1 3 5 3 ; HA-CA-CA-CA # aromatic ring (X-CA-CA-X generic proper dihedral)
Note
Multiple adjacent dihedral sections are currently merged into a single dihedral section. This means that if you separated proper and improper dihedrals into two different sections then they will now appear one after another under a single [ dihedrals ] section. If two dihedral sections are separated by another section then the last one overwrites the previous one.
New in version 0.2.5.
numpy.dtype columns for the data
output format (space separated), same ordering as columns
ITP [ pairs ] section
Example format:
[ pairs ]
; ai aj funct
1 6 1
1 7 1
New in version 0.2.5.
Not implemented, just ignores everything