
Weighted Cumulative Voting-Based Aggregation Algorithm for Combining Multiple Clusterings of Chemical Structures

  • Conference paper
Information Retrieval Technology (AIRS 2013)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8281))


Abstract

Many consensus clustering methods have been applied to combine multiple clusterings of chemical structures, including co-association matrix-based, graph-based, hypergraph-based and voting-based methods. Of these, the voting-based consensus methods showed the best performance. In this paper, a Weighted Cumulative Voting-based Aggregation Algorithm (W-CVAA) was developed to enhance the effectiveness of combining multiple clusterings of chemical structures. The effectiveness of the clusterings was evaluated by their ability to separate active from inactive molecules in each cluster, and the results were compared with Ward’s method, the standard clustering method for chemoinformatics applications. The MDL Drug Data Report (MDDR) chemical dataset was used. Experimental results suggest that the weighted cumulative voting-based consensus method can improve the effectiveness of combining multiple clusterings of chemical structures.
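
The abstract does not give the details of W-CVAA, so the following is only a minimal sketch of the general idea behind weighted cumulative voting, not the authors' algorithm. It assumes a fixed reference partition, a per-partition weight supplied by the caller, and a vote that each cluster splits across the reference clusters in proportion to shared members. The function name `weighted_cumulative_vote` and all of its parameters are hypothetical.

```python
from collections import Counter, defaultdict

def weighted_cumulative_vote(partitions, weights, reference):
    """Combine several label lists into one consensus labeling.

    partitions: list of label lists, one per clustering, all the same length
    weights:    one non-negative weight per partition (assumed, caller-supplied)
    reference:  label list whose cluster labels define the consensus clusters
    """
    n = len(reference)
    ref_labels = sorted(set(reference))
    totals = [defaultdict(float) for _ in range(n)]

    for labels, w in zip(partitions, weights):
        # Count how the members of each cluster spread over reference clusters.
        overlap = defaultdict(Counter)
        for lab, ref in zip(labels, reference):
            overlap[lab][ref] += 1
        for i, lab in enumerate(labels):
            size = sum(overlap[lab].values())
            # Each cluster splits its weighted vote in proportion to that overlap.
            for ref, count in overlap[lab].items():
                totals[i][ref] += w * count / size

    # Assign each object to the reference cluster with the largest total vote.
    return [max(ref_labels, key=lambda r: totals[i][r]) for i in range(n)]

# Toy usage: three clusterings of four molecules, the third down-weighted.
toy_partitions = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
consensus = weighted_cumulative_vote(toy_partitions, [1.0, 1.0, 0.5], toy_partitions[0])
```

Because the vote is split proportionally instead of going to a single best-matching cluster, the input partitions may have different numbers of clusters. How the weights are chosen is left open here.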




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saeed, F., Salim, N. (2013). Weighted Cumulative Voting-Based Aggregation Algorithm for Combining Multiple Clusterings of Chemical Structures. In: Banchs, R.E., Silvestri, F., Liu, TY., Zhang, M., Gao, S., Lang, J. (eds) Information Retrieval Technology. AIRS 2013. Lecture Notes in Computer Science, vol 8281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45068-6_16


  • DOI: https://doi.org/10.1007/978-3-642-45068-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45067-9

  • Online ISBN: 978-3-642-45068-6

  • eBook Packages: Computer Science, Computer Science (R0)
