Abstract
A pharmacophore describes the essential features shared by one or more molecules that have the same biological activity. Pharmacophores are defined by several common features: the types of functional groups present, the distances between atoms or groups of atoms, and the angles between such groups or individual atoms. In this paper, we present the design and implementation of Pharmadoop, a pharmacophore searching tool built on the Hadoop framework. Because it runs on Hadoop, Pharmadoop is faster than existing standalone pharmacophore search tools. It uses the MapReduce programming model to compare millions of conformers in a short time span. We demonstrate and compare the utility of Pharmadoop on ten distinct chemical datasets of ligand molecules by running a common substructure search job on standalone and multi-system Hadoop platforms. The results of that job were then used to perform pharmacophore searches on standalone and multi-node Hadoop distributions. The performance, speed and accuracy of the tool were evaluated through time-scale analysis and receiver operating characteristic (ROC) curves. The Pharmadoop tool can be accessed at http://bioserver.iiita.ac.in/Pharmadoop/.
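The idea of matching conformers against a pharmacophore in a map step and collecting hits per molecule in a reduce step can be illustrated with a minimal sketch. This is not Pharmadoop's code and does not use Hadoop: it is a single-process Python illustration of the MapReduce pattern, and the query features, the tolerance, the record layout, and all function names (`matches`, `map_conformer`, `reduce_hits`, `run_job`) are assumptions made for the example. Only feature types and pairwise distances are checked; explicit angle constraints are omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations, permutations
from math import dist

# Hypothetical query pharmacophore: (feature type, xyz position in angstroms).
QUERY = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("aromatic", (0.0, 4.0, 0.0)),
]
TOLERANCE = 0.5  # assumed allowed deviation per pairwise distance, in angstroms


def matches(query, features, tol):
    """Return True if some ordered selection of conformer features has the
    same types as the query and the same pairwise distances within tol."""
    n = len(query)
    for candidate in permutations(features, n):
        if any(c[0] != q[0] for c, q in zip(candidate, query)):
            continue
        if all(
            abs(dist(candidate[i][1], candidate[j][1])
                - dist(query[i][1], query[j][1])) <= tol
            for i, j in combinations(range(n), 2)
        ):
            return True
    return False


def map_conformer(record):
    """Map step: emit (molecule_id, conformer_id) for each matching conformer."""
    molecule_id, conformer_id, features = record
    if matches(QUERY, features, TOLERANCE):
        yield molecule_id, conformer_id


def reduce_hits(molecule_id, conformer_ids):
    """Reduce step: collect the matching conformers of one molecule."""
    return molecule_id, sorted(conformer_ids)


def run_job(records):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_conformer(record):
            grouped[key].append(value)
    return dict(reduce_hits(k, v) for k, v in sorted(grouped.items()))


# Made-up input records: (molecule_id, conformer_id, features).
RECORDS = [
    # Translated copy of the query geometry: should match.
    ("mol1", 0, [("donor", (1, 1, 1)), ("acceptor", (4, 1, 1)), ("aromatic", (1, 5, 1))]),
    # Acceptor is 8 angstroms from the donor instead of 3: should not match.
    ("mol1", 1, [("donor", (1, 1, 1)), ("acceptor", (9, 1, 1)), ("aromatic", (1, 5, 1))]),
    # No aromatic feature at all: should not match.
    ("mol2", 0, [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0))]),
]

print(run_job(RECORDS))  # {'mol1': [0]}
```

Because each conformer is tested independently in the map step, the work can in principle be split across nodes, which is the property a Hadoop deployment would exploit. The brute-force permutation search shown here is only practical for small feature sets.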
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13721-017-0161-x/MediaObjects/13721_2017_161_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13721-017-0161-x/MediaObjects/13721_2017_161_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13721-017-0161-x/MediaObjects/13721_2017_161_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13721-017-0161-x/MediaObjects/13721_2017_161_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13721-017-0161-x/MediaObjects/13721_2017_161_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13721-017-0161-x/MediaObjects/13721_2017_161_Fig6_HTML.gif)
Cite this article
Semwal, R., Aier, I., Raj, U. et al. Pharmadoop: a tool for pharmacophore searching using Hadoop framework. Netw Model Anal Health Inform Bioinforma 6, 20 (2017). https://doi.org/10.1007/s13721-017-0161-x
DOI: https://doi.org/10.1007/s13721-017-0161-x