A Similarity-Based Grouping Method for Molecular Docking in Distributed System

Zhang, Ruisheng; Liu, Guangcai; Hu, Rongjing; Wei, Jiaxuan; Li, Juan

doi:10.1007/978-3-642-53914-5_47

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8346)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Molecular docking is one of the main techniques in virtual screening. Docking times vary widely from molecule to molecule because of differences in chemical structure. This variation can overload some nodes and thereby reduce the data processing capacity of the whole distributed molecular docking system, so a reasonable and efficient data grouping strategy is essential. In this paper, we study molecular structural similarity in depth and propose a similarity-based data grouping method. Building on earlier work on the Database Management System for Virtual Screening, the method uses the computational chemistry software Chemistry Development Kit (CDK) and cluster analysis methods to process the molecular data. Finally, we implement and deploy the data grouping method on the Hadoop distributed platform. The experimental results show that this data grouping method can improve the efficiency of molecular docking.
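The abstract does not spell out the grouping algorithm, so the following is only a minimal sketch of what similarity-based grouping can look like, not the authors' method. It assumes each molecule is already represented by a fingerprint (here a set of "on" bit positions), compares fingerprints with the Tanimoto coefficient, and groups them with a simple greedy leader clustering. The function names, the toy fingerprints, and the 0.5 threshold are all hypothetical; in practice the fingerprints would come from a cheminformatics toolkit such as CDK.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_grouping(fps: Dict[str, FrozenSet[int]], threshold: float) -> List[List[str]]:
    """Greedy leader clustering (an assumed stand-in for the paper's cluster analysis).

    A molecule joins the first group whose leader (first member) it resembles
    at or above `threshold`; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for name, fp in fps.items():
        for group in groups:
            if tanimoto(fp, fps[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical toy fingerprints (not real molecules and not real CDK output).
fingerprints = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3, 5}),
    "mol_c": frozenset({10, 11, 12}),
    "mol_d": frozenset({10, 11, 13}),
}

groups = leader_grouping(fingerprints, threshold=0.5)
print(groups)
```

With this toy data, mol_a and mol_b share three of five distinct bits and land in one group, while mol_c and mol_d form a second group. How such groups would then be assigned to nodes is not described in the abstract and is left out of the sketch.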

Author information

Authors and Affiliations

School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
Ruisheng Zhang, Guangcai Liu, Rongjing Hu, Jiaxuan Wei & Juan Li

Editor information

Editors and Affiliations

US Air Force Office of Scientific Research, 106-0032, Tokyo, Japan
Hiroshi Motoda
School of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, China
Zhaohui Wu
Faculty of Engineering and Information Technology, University of Technology, Chippendale, 2008, Sydney, NSW, Australia
Longbing Cao
Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, Canada
Osmar Zaiane
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Min Yao
School of Computer Science, Fudan University, 200433, Shanghai, China
Wei Wang

About this paper

Cite this paper

Zhang, R., Liu, G., Hu, R., Wei, J., Li, J. (2013). A Similarity-Based Grouping Method for Molecular Docking in Distributed System. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53914-5_47

DOI: https://doi.org/10.1007/978-3-642-53914-5_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53913-8
Online ISBN: 978-3-642-53914-5
eBook Packages: Computer Science, Computer Science (R0)
