A Similarity-Based Grouping Method for Molecular Docking in Distributed System

  • Conference paper
Advanced Data Mining and Applications (ADMA 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8346)

Abstract

Molecular docking is one of the main techniques in virtual screening. During docking, the time required per molecule varies widely because of differences in chemical structure. This variation can overload certain nodes, reducing the data processing capacity of the whole distributed molecular docking system. A reasonable and efficient data grouping strategy is therefore essential in such a system. In this paper, we study molecular structural similarity in depth and propose a similarity-based data grouping method. Building on earlier work on the Database Management System for Virtual Screening, the method uses the computational chemistry software Chemistry Development Kit together with cluster analysis methods to process the molecular data. Finally, we implement and deploy the data grouping method on the Hadoop distributed platform. The experimental results show that this data grouping method can improve the efficiency of molecular docking.
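
The abstract does not spell out the grouping procedure, so the following is only a minimal sketch of one common way to group molecules by structural similarity, not the authors' implementation. It assumes each molecule is already represented as a set of "on" fingerprint bits (in practice such fingerprints might come from a toolkit such as the Chemistry Development Kit), uses the Tanimoto coefficient as the similarity measure, and applies a simple leader-style grouping. The names `tanimoto`, `group_by_similarity`, the threshold value, and the toy fingerprints are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def group_by_similarity(
    fingerprints: Dict[str, FrozenSet[int]], threshold: float
) -> List[List[str]]:
    """Leader-style grouping: a molecule joins the first group whose
    representative (the group's first member) is at least `threshold`
    similar to it; otherwise it starts a new group."""
    groups: List[List[str]] = []
    for name, fp in fingerprints.items():
        for group in groups:
            if tanimoto(fp, fingerprints[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy fingerprints (illustrative bit positions, not real molecules).
toy = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3, 5}),
    "mol_c": frozenset({10, 11, 12}),
    "mol_d": frozenset({10, 11, 12, 13}),
}
groups = group_by_similarity(toy, threshold=0.5)
print(groups)
```

With these toy inputs, `mol_a` and `mol_b` share three of five distinct bits and `mol_c` and `mol_d` share three of four, so each pair ends up in its own group. How such groups would then be assigned to nodes in a distributed docking system is a separate design decision that this sketch does not address.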

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, R., Liu, G., Hu, R., Wei, J., Li, J. (2013). A Similarity-Based Grouping Method for Molecular Docking in Distributed System. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53914-5_47

  • DOI: https://doi.org/10.1007/978-3-642-53914-5_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53913-8

  • Online ISBN: 978-3-642-53914-5

  • eBook Packages: Computer Science (R0)
