POAP: A GNU parallel based multithreaded pipeline of open babel and AutoDock suite for boosted high throughput virtual screening

https://doi.org/10.1016/j.compbiolchem.2018.02.012

Highlights

  • POAP is a GNU Parallel based pipeline that enables optimally parallelized high throughput virtual screening (HTVS) runs by coupling Open Babel and the AutoDock suite.

  • The ligand preparation option in POAP is unique in offering extensive parameter input and well-optimized parallelization.

  • POAP provides high scalability and optimal usage of CPU cores, leading to a significant reduction in computational time.

  • POAP features multi receptor docking and comparative analysis, enabling drug repurposing studies.

Abstract

High throughput virtual screening plays a crucial role in hit identification during the drug discovery process. With the rapid growth of chemical libraries, virtual screening becomes computationally challenging, creating a demand for efficiently parallelized software pipelines. Here we present POAP, a GNU Parallel based pipeline that is programmed to run Open Babel and the AutoDock suite under highly optimized parallelization. The ligand preparation module is a unique feature of POAP: it offers extensive options for geometry optimization, conformer generation and parallelization, and it quarantines erroneous datasets for seamless operation. POAP also features multi receptor docking, which can be used for comparative virtual screening and drug repurposing studies. As demonstrated using different structural datasets, POAP proves to be an efficient pipeline that enables high scalability, seamless operability, dynamic file handling and optimal utilization of CPUs for computationally demanding tasks. POAP is distributed freely under the GNU GPL license and can be downloaded at https://github.com/inpacdb/POAP.

Introduction

Drug discovery and development have undergone phenomenal changes over the years. Computer aided drug design strategies such as molecular modelling and structure-based virtual screening have driven this progressive change. Moreover, the swift and cost effective discovery of new therapeutic moieties is the need of the hour, and can only be achieved by implementing robust computational approaches. Continual advancement in the application of computational methods to biological and chemical space has strongly influenced the modern drug development chain (Rahman et al., 2012). Computational tools are widely applied for predicting hit molecules against a target of interest, and many such predictions have proven to be highly accurate on experimental validation (Kuhn et al., 2016; Sliwoski et al., 2014; Xia, 2017). At present, structure based drug design involving structural refinement, molecular docking and virtual screening has become an indispensable part of the drug discovery process (Ferreira et al., 2015; Kalyani, 2013; Śledź and Caflisch, 2017).

The “open source” concept has revolutionized the software industry worldwide. The Open Source Initiative announced a ten-point standard to define the term “open source” (Årdal and Røttingen, 2012). Among these ten points, three are considered the most significant: access to source code, free redistribution and creation of derived works (Årdal and Røttingen, 2012). Open source drug design software such as AutoDock and Open Babel has played a major role in accelerating the drug discovery process (Umashankar and Gurunathan, 2015) and abides well by the key standards of the Open Source Initiative. However, the full potential of these open source tools in high throughput drug discovery can only be unleashed through massive parallelization and proper placement in a computational pipeline. In general, efficiently parallelized and guided workflows are available only in expensive, commercially licensed software. Thus, an efficiently parallelized open source virtual screening pipeline would allow many scientists with limited resources to screen large chemical libraries efficiently.

Recent benchmarking studies on free and commercial docking tools have shown AutoDock Vina to be an optimal performer in identifying the best ligand bound pose (Wang et al., 2016). Virtual screening of ligands using AutoDock and AutoDock Vina is used extensively for lead identification against target proteins. Many useful tools such as PyRx, Raccoon, DOVIS, VSdocker, AUDocker LE and PyMOL plugins are available for performing virtual screening studies with AutoDock and AutoDock Vina (Chen, 2015, Chen, 2015; Lill and Danielson, 2011; Prakhov et al., 2010; Sandeep et al., 2011; Zhang et al., 2008). However, there is a need for pipelines that efficiently use the simple yet powerful GNU Parallel to parallelize the complete virtual screening workflow, from ligand preparation to post docking analysis. In particular, there is a dearth of open source pipelines that can handle the ligand preparation process in a parallelized manner. Inverse docking and multiple protein docking protocols have been proven to be powerful methods for assigning targets to a ligand of interest (Li et al., 2006; Medina-Franco et al., 2013). These protocols demand well parallelized pipelines capable of handling huge datasets.

Although there have been commendable attempts to develop such open source pipelines, concerns remain regarding installation, configuration and guided workflow. Moreover, many of these pipelines do not fully exploit the highly efficient GNU Parallel tool to parallelize the entire virtual screening process, from ligand preparation to post docking analysis.

Hence, in this study, we developed a parallelized virtual screening pipeline, the Parallelized Open Babel & AutoDock suite Pipeline (POAP), which integrates popular tools such as Open Babel, AutoDock, AutoDock Vina and AutoDockZn in an easily configurable, bash shell based text interface. POAP offers modules for ligand preparation, single receptor virtual screening, multiple receptor virtual screening and consensus scoring. All these modules are engineered to run in a GNU Parallel based multi CPU environment. POAP also implements well optimized dynamic file handling, which enables optimal RAM usage, quarantining of erroneous ligand datasets so that the workflow runs unperturbed, and structured access to input, output and intermediary files. The developed pipeline demonstrates how the GNU Parallel tool can be used effectively to build a complete virtual screening workflow.
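The quarantining idea described above can be illustrated with a minimal sketch. It is not POAP code: POAP is a bash pipeline driven by GNU Parallel, whereas this example swaps in Python's standard `concurrent.futures` thread pool purely for illustration. The names `prepare_ligand` and `run_with_quarantine`, the "empty file" failure rule and the directory layout are all hypothetical; in a real run the worker would invoke an external tool such as Open Babel.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def prepare_ligand(path: Path) -> str:
    """Hypothetical stand-in for one ligand preparation job.

    A real worker would call an external tool here. This stub only
    rejects empty files so that the quarantine path can be exercised.
    """
    if not path.read_text().strip():
        raise ValueError(f"empty ligand file: {path.name}")
    return path.stem


def run_with_quarantine(ligands, quarantine_dir, worker=prepare_ligand, jobs=4):
    """Run `worker` on every ligand in parallel and move failing inputs aside.

    Returns (prepared, quarantined): worker results for the successful
    inputs, and file names of the inputs that were moved to quarantine.
    """
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    ligands = [Path(p) for p in ligands]

    def safe(path):
        # Capture the failure instead of letting one bad ligand stop the batch.
        try:
            return path, worker(path), None
        except Exception as exc:
            return path, None, exc

    prepared, quarantined = [], []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map yields results in input order, so the output is deterministic.
        for path, result, error in pool.map(safe, ligands):
            if error is None:
                prepared.append(result)
            else:
                shutil.move(str(path), str(quarantine_dir / path.name))
                quarantined.append(path.name)
    return prepared, quarantined
```

The design point is that each job's failure is caught and recorded individually, so a single malformed ligand is set aside for later inspection while the remaining jobs continue.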

Materials and methods

POAP was developed in the bash scripting language and integrates widely used tools: Open Babel-2.4.0 for ligand optimization; AutoDock-4.2.6, AutoDock Vina-1.1.2 and AutoDockZn for virtual screening; and scripts from MGLTOOLS-1.5.7 (http://mgltools.scripps.edu/). Parallelized execution of the jobs was achieved using the GNU Parallel tool.

Speedup performance of ligand preparation module in POAP

The efficiency of the ligand preparation module was validated using four (N = 4) different ligand databases: FDA approved drugs from DrugBank, Myriascreen-II, NDL and EFD. The validation runs operated seamlessly and generated ligand datasets with optimal geometry, favourable for virtual screening. Among the ligands prepared, Open Babel reported errors for a few ligands during 2D to 3D conversion, conformer generation, minimization and pdbqt conversion. POAP
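For readers unfamiliar with how speedup is usually quantified, the following minimal sketch shows the conventional definitions: speedup is the serial wall time divided by the parallel wall time, efficiency is speedup per core, and Amdahl's law (which appears in the reference list) bounds the speedup when only a fraction of the work can be parallelized. The function names and the timings in the usage note are hypothetical and are not results from this study.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of single-core wall time to multi-core wall time."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """Speedup per core; 1.0 would mean perfect linear scaling."""
    return speedup(t_serial, t_parallel) / cores


def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
```

As an illustration with made-up numbers, a job that takes 1200 s on one core and 200 s on eight cores has a speedup of 6 and an efficiency of 0.75. If 90% of a workload can be parallelized, Amdahl's law caps the eight-core speedup at roughly 4.7.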

Conclusion

POAP is distinctive in harnessing GNU Parallel to parallelize ligand preparation by Open Babel and virtual screening with the AutoDock suite. It features the unique and important function of quarantining erroneous ligands, which is essential for an unperturbed parallelized run. The efficiency of the POAP modules in handling different datasets has been demonstrated in this study. POAP is distributed freely under the GNU GPL license with extensive manual and supporting

Conflict of interest

The authors declare that there are no conflicts of interest.

Acknowledgements

We would like to acknowledge the Department of Bio-Technology (DBT), Ministry of Science and Technology, Government of India, for providing financial assistance through a DBT-JRF Fellowship [DBT/2015/VRF/363] to Samdani for carrying out this work. The authors also thank the DBT Rapid Grant for Young Investigator (RGYI) scheme [BT/PR6476/GBD/27/496/2013, 05/09/2013] for the hardware support.

References (45)

  • O. Defert et al. Rho kinase inhibitors: a patent review (2014–2016). Expert Opin. Ther. Pat. (2017)
  • C. Empereur-Mot et al. Screening explorer-an interactive tool for the analysis of screening results. J. Chem. Inf. Model. (2016)
  • L.G. Ferreira et al. Molecular docking and structure-based drug design strategies. Molecules (Basel, Switzerland) (2015)
  • C. Ji et al. Designed small-molecule inhibitors of the anthranilyl-CoA synthetase PqsA block quinolone biosynthesis in Pseudomonas aeruginosa. ACS Chem. Biol. (2016)
  • G. Kalyani. A review on drug designing, methods, its applications and prospects. Int. J. Pharm. Res. Dev. (2013)
  • R.K. Karmani et al. Amdahl’s law.
  • B. Kuhn et al. A real-world perspective on molecular design. J. Med. Chem. (2016)
  • S. Lätti et al. Rocker: open source, easy-to-use tool for AUC and enrichment calculations and ROC visualization. J. Cheminf. (2016)
  • B. Lesic et al. Inhibitors of pathogen intercellular signals as selective anti-infective compounds. PLoS Pathog. (2007)
  • H. Li et al. TarFisDock: a web server for identifying drug targets with docking approach. Nucleic Acids Res. (2006)
  • M.A. Lill et al. Computer-aided drug design platform using PyMOL. J. Comput. Aided Mol. Des. (2011)
  • G.M. Morris et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. (2009)