Eucb: A C++ program for molecular dynamics trajectory analysis

https://doi.org/10.1016/j.cpc.2010.11.032

Abstract

Eucb is a standalone program for geometrical analysis of molecular dynamics trajectories of protein systems. The program is written in GNU C++ and can be installed on any operating system that provides a C++ compiler. It performs its analytical tasks based on user-supplied keywords. The source code is freely available from http://stavrakoudis.econ.uoi.gr/eucb under the LGPL 3 license.

Program summary

Program title: Eucb

Catalogue identifier: AEIC_v1_0

Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIC_v1_0.html

Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland

Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html

No. of lines in distributed program, including test data, etc.: 31 169

No. of bytes in distributed program, including test data, etc.: 297 364

Distribution format: tar.gz

Programming language: GNU C++

Computer: The tool is designed and tested on GNU/Linux systems

Operating system: Unix/Linux systems

RAM: 2 MB

Supplementary material: Sample data files are available

Classification: 3

Nature of problem: Analysis of molecular dynamics trajectories.

Solution method: The program enumerates all possible interactions according to the input files and the user instructions. It then reads every trajectory frame and finds the frames in which these interactions occur, under given geometrical criteria. This is a blind search, made without a priori knowledge of whether a certain interaction occurs. The program exports time series of the measured quantities (distances, angles, etc.) together with appropriate descriptive statistics.
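The scan described in the solution method can be illustrated with a minimal sketch. It is not taken from the eucb source: the types Vec3, Frame and PairSummary, the function scan_pair and the simple distance cutoff are all hypothetical, chosen only to show how a per-frame criterion yields a time series plus descriptive statistics.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cartesian coordinates of one atom (hypothetical type, units in Angstrom).
struct Vec3 { double x, y, z; };

// One trajectory frame: coordinates indexed by atom number.
using Frame = std::vector<Vec3>;

// Summary of one atom pair over the whole trajectory.
struct PairSummary {
    std::vector<double> series;  // distance in every frame (the time series)
    std::size_t hits = 0;        // frames in which the distance criterion holds
    double mean = 0.0;           // mean distance over all frames
    double stddev = 0.0;         // population standard deviation
};

double distance(const Vec3& a, const Vec3& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Scan all frames for atoms i and j. A frame counts as a hit when the
// distance is at most `cutoff` (an assumed, user-supplied criterion).
PairSummary scan_pair(const std::vector<Frame>& traj, std::size_t i,
                      std::size_t j, double cutoff) {
    PairSummary s;
    double sum = 0.0, sum_sq = 0.0;
    for (const Frame& f : traj) {
        assert(i < f.size() && j < f.size());
        const double d = distance(f[i], f[j]);
        s.series.push_back(d);
        if (d <= cutoff) ++s.hits;
        sum += d;
        sum_sq += d * d;
    }
    if (!traj.empty()) {
        const double n = static_cast<double>(traj.size());
        s.mean = sum / n;
        const double var = sum_sq / n - s.mean * s.mean;
        s.stddev = var > 0.0 ? std::sqrt(var) : 0.0;
    }
    return s;
}
```

Dividing hits by the number of frames gives the fraction of the trajectory in which the interaction is present. A blind search would apply the same scan to every candidate pair rather than to one pair chosen in advance.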

Running time: Depends on the input data and the required options.

Introduction

Analyzing trajectory data is very often the bottleneck in obtaining biological information from molecular dynamics computer simulations. Here we present a new software tool called eucb (Euclidean computational biology, named after the famous Greek mathematician Euclid of Alexandria, known as the “Father of Geometry”). It is written in GNU C++ and can perform many geometrical calculations. The eucb program does not depend on any external mathematical or graphics library. The program was developed on GNU/Linux, but it can be installed on any operating system that runs the GNU C++ compiler. The distribution comes as a compressed tarball and must be compiled before use. The only product of the compilation is a single binary file that needs no additional resource files; hence, the user can execute the program from any directory of the system.

There is a continuing interest in trajectory analysis, and several software tools have appeared in the literature in recent years [1], [13], [14], [15]. The eucb program offers more options than other similar programs. It performs the required calculations based on user instructions given as command line options. The program works with NAMD/CHARMM [2] compatible trajectories and can be used for several types of calculation, such as H-bonds, beta-turns, NOE-type interactions, stacking residues, weak interactions like NH/aromatic hydrogen bonds, hydrophobic clusters, water-bridged hydrogen bonds, etc. The final outcome of such calculations is a series of files, including time series files, files with descriptive statistics, moving averages, histogram files, etc. The program accepts three files as input:

  1. A .psf file, which describes the molecular structure of the system in CHARMM/XPLOR format.

  2. A .dcd file, which holds the molecular dynamics trajectory in binary format.

  3. A .pdb file, which holds the reference coordinates of the system in PDB format.
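To make the third input concrete, the following is a minimal sketch of reading one fixed-width ATOM record from a PDB file. It relies only on the standard PDB column layout (serial in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, coordinates in 31-54). The struct PdbAtom and the functions field and parse_atom are hypothetical names and are not part of eucb.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Fields of one ATOM/HETATM record (hypothetical container type).
struct PdbAtom {
    int serial;
    std::string name;     // atom name, e.g. "CA"
    std::string resName;  // residue name, e.g. "ALA"
    char chain;           // chain identifier
    int resSeq;           // residue sequence number
    double x, y, z;       // orthogonal coordinates in Angstrom
};

// Return the fixed-width field starting at 0-based offset `pos` with
// length `len`, with surrounding blanks removed.
std::string field(const std::string& line, std::size_t pos, std::size_t len) {
    if (pos >= line.size()) return "";
    const std::string s = line.substr(pos, len);
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Parse one line; returns false for other record types or malformed lines.
bool parse_atom(const std::string& line, PdbAtom& out) {
    if (line.size() < 54) return false;  // coordinates end at column 54
    const std::string rec = line.substr(0, 6);
    if (rec != "ATOM  " && rec != "HETATM") return false;
    try {
        out.serial = std::stoi(field(line, 6, 5));
        out.name = field(line, 12, 4);
        out.resName = field(line, 17, 3);
        out.chain = line[21];
        out.resSeq = std::stoi(field(line, 22, 4));
        out.x = std::stod(field(line, 30, 8));
        out.y = std::stod(field(line, 38, 8));
        out.z = std::stod(field(line, 46, 8));
    } catch (const std::exception&) {
        return false;  // a numeric field was empty or not a number
    }
    return true;
}
```

Reading by column rather than by whitespace matters for PDB files, because adjacent fields may touch when values are large or negative.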

The most striking features of the program are:

  1. It calculates any type of geometry (distance, angle or dihedral) from the MD trajectory and performs all the standard calculations (RMSF, RMSD, standard torsions, etc.).

  2. It scans MD trajectories for specific types of interaction, such as hydrogen bonds, hydrophobic interactions, salt bridges, stacking side chains, etc., using only high level keywords, such as hbonds, salt2, salt3, stack, etc.

  3. It identifies hydrophobic clusters throughout the MD trajectory.

  4. It calculates the instant water coordination number [5] and can thus distinguish isolated water molecules from the bulk solvent.

  5. It calculates water-bridged hydrogen bond interactions.

  6. It can calculate NMR-related quantities from the MD trajectory, such as NOE distances and J coupling constants.

  7. It exports histograms and descriptive statistics of the calculated variables.
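The basic geometric quantities in the first item can be sketched in a few lines of dependency-free C++. This is an illustration under assumed names (Vec3, distance, angle, dihedral), not the code used inside eucb. The dihedral uses the common atan2 formulation and returns a signed angle in degrees.

```cpp
#include <cmath>

// Cartesian coordinates of one atom (hypothetical type).
struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{a.x - b.x, a.y - b.y, a.z - b.z};
}

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

double norm(const Vec3& a) { return std::sqrt(dot(a, a)); }

// Conversion factor from radians to degrees.
const double kRadToDeg = 180.0 / std::acos(-1.0);

// Distance between atoms a and b.
double distance(const Vec3& a, const Vec3& b) { return norm(sub(a, b)); }

// Angle a-b-c at vertex b, in degrees, range [0, 180].
double angle(const Vec3& a, const Vec3& b, const Vec3& c) {
    const Vec3 u = sub(a, b);
    const Vec3 v = sub(c, b);
    return std::atan2(norm(cross(u, v)), dot(u, v)) * kRadToDeg;
}

// Dihedral a-b-c-d about the b-c bond, in degrees, range [-180, 180].
double dihedral(const Vec3& a, const Vec3& b, const Vec3& c, const Vec3& d) {
    const Vec3 b1 = sub(b, a);
    const Vec3 b2 = sub(c, b);
    const Vec3 b3 = sub(d, c);
    const Vec3 n1 = cross(b1, b2);  // normal of the plane a-b-c
    const Vec3 n2 = cross(b2, b3);  // normal of the plane b-c-d
    const double y = norm(b2) * dot(b1, n2);
    const double x = dot(n1, n2);
    return std::atan2(y, x) * kRadToDeg;
}
```

For instance, applying dihedral to the backbone atoms C(i-1), N(i), CA(i), C(i) of a protein gives the phi torsion of residue i. Using atan2 instead of acos keeps the sign of the torsion and avoids loss of precision near 0 and 180 degrees.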

The program is highly configurable and all parameters can be customized by the user. It is thus expected to facilitate the biological interpretation of simulation data. The basic advantage of eucb is its high level instructions. The user can ask human-type questions such as “what are the close contacts between chains A and B in a protein complex?” and express the query in a single command, without having to calculate the specific distances that characterize these interactions. Beyond calculating time series, the program also exports smoothed time series, descriptive statistics and histograms that are suitable for plotting.
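To show what such a high level query has to do underneath, here is a minimal sketch of a close-contact search between two chains in a single frame. The types Atom and Contact, the function close_contacts and the plain distance cutoff are assumptions made for illustration; the actual criteria and data structures of eucb may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// One atom of a single frame (hypothetical type).
struct Atom {
    char chain;        // chain identifier, e.g. 'A'
    int resSeq;        // residue number
    std::string name;  // atom name
    double x, y, z;    // coordinates in Angstrom
};

// A pair of atom indices closer than the cutoff, with their distance.
struct Contact {
    std::size_t i, j;
    double dist;
};

// Return every pair (i in chainA, j in chainB) whose distance is at most
// `cutoff`. Brute force over all pairs, which is adequate for a sketch.
std::vector<Contact> close_contacts(const std::vector<Atom>& atoms,
                                    char chainA, char chainB, double cutoff) {
    std::vector<Contact> out;
    const double cutoff2 = cutoff * cutoff;
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        if (atoms[i].chain != chainA) continue;
        for (std::size_t j = 0; j < atoms.size(); ++j) {
            if (atoms[j].chain != chainB) continue;
            const double dx = atoms[i].x - atoms[j].x;
            const double dy = atoms[i].y - atoms[j].y;
            const double dz = atoms[i].z - atoms[j].z;
            const double d2 = dx * dx + dy * dy + dz * dz;
            // Compare squared distances; take the root only for hits.
            if (d2 <= cutoff2) out.push_back(Contact{i, j, std::sqrt(d2)});
        }
    }
    return out;
}
```

Repeating the search for every frame and recording the distance of each detected pair would produce the kind of time series the text describes. For large systems a cell list or similar spatial index would replace the double loop.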


Installation

This is a short quick start guide for the installation of the program, with some small examples that explain its basic features. Users who want to work with eucb can consult the online manual of the program, available at http://stavrakoudis.econ.uoi.gr/eucb.

The program is distributed in compressed tar.gz format and can be downloaded from http://stavrakoudis.econ.uoi.gr/eucb. The source code release is distributed under the LGPL3 license. The program is written in

Usage

The program requires a series of input files to work properly and performs the computations dictated by the command line options. It produces a series of output files: time series files accompanied by statistics, moving averages, histograms and log files. For example the command: computes the RMSD of the trajectory protein.dcd frames after fitting the structures on the structure of the protein.pdb file. Non-hydrogen (heavy) atoms of segments A, C are taken into

Options

The eucb program has a variety of command line options, which are divided into general options and computing options. The general options set flags that control the program, while the computing options select the quantities to compute and produce the corresponding time series files.

Example runs

We take a previous study [6] as a test case to explore the capabilities of the eucb software. The accompanying files can be downloaded from the eucb site. The target here is to investigate the hydrophobic interactions between the protein chains A and C. This is not a trivial task with the available software tools and involves considerable user effort. With eucb it reduces to a single command. The program interprets high level instructions to all necessary

Conclusions

Practitioners of MD simulations can benefit from the eucb software both by speeding up common calculations and by performing more elegant queries on MD trajectories. High level instructions have been introduced to facilitate the analysis of MD trajectories from a biological perspective. The program is highly configurable, instructions are programmable via shell scripts, and new features can be added with minimal effort in C++ coding. Bug fixes, new options and code improvements

References (17)

  • P.M. Petrone et al.

    MHC-peptide binding is assisted by bound water molecules

    Journal of Molecular Biology

    (2004)
  • V.A. Tatsis et al.

    Insights into the structure of the PmrD protein with molecular dynamics simulations

    International Journal of Biological Macromolecules

    (2009)
  • N.M. Glykos

    Carma: A molecular dynamics analysis program

    Journal of Computational Chemistry

    (2006)
  • J.C. Phillips et al.

    Scalable molecular dynamics with NAMD

    Journal of Computational Chemistry

    (2005)
  • G. Tóth et al.

    Stabilization of local structures by π-CH and aromatic–backbone amide interactions involving prolyl and aromatic residues

    Protein Engineering

    (2001)
  • W. Kabsch

    A solution of the best rotation to relate two sets of vectors

    Acta Crystallographica

    (1976)
  • A. Stavrakoudis

    A disulfide linked model of the complement protein C8γ complexed with C8α indel peptide

    Journal of Molecular Modeling

    (2009)
  • A. Stavrakoudis et al.

    Molecular dynamics simulation of antimicrobial peptide arenicin-2: β-hairpin stabilization by noncovalent interactions

    Biopolymers

    (2009)
There are more references available in the full text version of this article.



This paper and its associated computer program are available via the Computer Physics Communications homepage on ScienceDirect (http://www.sciencedirect.com/science/journal/00104655).
