Skip to main content

An Efficiently Computable Graph-Based Metric for the Classification of Small Molecules

  • Conference paper
Discovery Science (DS 2008)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5255))

Included in the following conference series:

Abstract

In machine learning, there has been an increased interest in metrics on structured data. The application we focus on is drug discovery. Although graphs have become very popular for the representation of molecules, a lot of operations on graphs are NP-complete. Representing the molecules as outerplanar graphs, a subclass within general graphs, and using the block-and-bridge preserving subgraph isomorphism, we define a metric and we present an algorithm for computing it in polynomial time. We evaluate this metric and more generally also the block-and-bridge preserving matching operator on a large dataset of molecules, obtaining favorable results.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ceroni, A., Costa, F., Frasconi, P.: Classification of small molecules by two- and three-dimensional decomposition kernels. Bioinformatics 23(16), 2038–2045 (2007)

    Article  Google Scholar 

  2. Shearer, K., Bunke, H., Venkatesh, S.: Video indexing and similarity retrieval by largest common subgraph detection using decision trees. Pattern Recognition Letters 34(5), 1075–1091 (2001)

    MATH  Google Scholar 

  3. Johnson, M., Maggiora, G.: Concepts and Applications of Molecular Similarity. John Wiley, Chichester (1990)

    Google Scholar 

  4. Deshpande, M., Kuramochi, M., Wale, N., Karypis, G.: Frequent substructure-based approaches for classifying chemical compounds. IEEE Transactions on Knowledge and Data Engineering 17(8), 1036–1050 (2005)

    Article  Google Scholar 

  5. Raymond, J., Willett, P.: Maximum common subgraph isomorphism algorithms for the matching of chemical structures. Computer-Aided Molecular Design 16, 521–533 (2002)

    Article  Google Scholar 

  6. Garey, M.R., Johnson, D.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman and Co., New York (1979)

    MATH  Google Scholar 

  7. Mitchell, S.L.: Linear algorithms to recognize outerplanar and maximal outerplanar graphs. Information Processing Letters 9(5), 229–232 (1979)

    Article  MathSciNet  MATH  Google Scholar 

  8. Horváth, T., Ramon, J., Wrobel, S.: Frequent subgraph mining in outerplanar graphs. In: Proceedings of the 12th ACM SIGKDD, pp. 197–206 (2006)

    Google Scholar 

  9. Hansch, C., Maolney, P., Fujita, T., R.M.: Correlation of biological activity of phenoxyacetic acids with hammett substituent constants and partition coefficients. Nature 194, 178–180 (1962)

    Article  Google Scholar 

  10. Willett, P.: Similarity-based virtual screening using 2D fingerprints. Drug Discovery Today 11(23/24), 1046–1051 (2006)

    Article  Google Scholar 

  11. King, R., Muggleton, S., Srinivasan, A., Sternberg, M.: Structure-activity relationships derived by machine learning: The use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. PNAS 93, 438–442 (1996)

    Article  Google Scholar 

  12. Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: Proceedings of the IEEE Int. Conf. on Data Mining, pp. 721–724. IEEE Computer Society, Los Alamitos (2002)

    Google Scholar 

  13. Swamidass, S.J., Chen, J., Bruand, J., Phung, P., Ralaivola, L., Baldi, P.: Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity. Bioinformatics 21(suppl_1), 359–368 (2005)

    Article  Google Scholar 

  14. Raymond, J., Gardiner, E., Willett, P.: Rascal: Calculation of graph similarity using maximum common edge subgraphs. Computer Journal 45, 631–644 (2002)

    Article  MATH  Google Scholar 

  15. Diestel, R.: Graph Theory. Springer, Heidelberg (2000)

    Google Scholar 

  16. Syslo, M.: The subgraph isomorphism problem for outerplanar graphs. Theoretical Computer Science 17(1), 91–97 (1982)

    Article  MathSciNet  MATH  Google Scholar 

  17. Bunke, H., Shearer, K.: A graph distance metric based on the maximal common subgraph. Pattern Recognition Letters 18, 689–694 (1997)

    Article  Google Scholar 

  18. Raymond, J., Willett, P.: Effectiveness of graph-based and fingerprint-based similarity measures for virtual screening of 2D chemical structure databases. Journal of Computer-Aided Design 16, 59–71 (2002)

    Article  Google Scholar 

  19. Munkres, J.: Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics 5, 32–38 (1957)

    Article  MathSciNet  MATH  Google Scholar 

  20. Joachims, T.: Learning to Classify Text using Support Vector Machines: Methods, Theory, and Algorithms. Springer, Heidelberg (2002)

    Book  Google Scholar 

  21. Bringmann, B., Zimmermann, A., De Raedt, L., Nijssen, S.: Don’t be afraid of simpler patterns. In: Proc. of the 10th European Conf. on Principles and Practice of Knowledge Discovery in Databases, pp. 55–66 (2006)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer Berlin Heidelberg

About this paper

Cite this paper

Schietgat, L., Ramon, J., Bruynooghe, M., Blockeel, H. (2008). An Efficiently Computable Graph-Based Metric for the Classification of Small Molecules. In: Jean-Fran, JF., Berthold, M.R., Horváth, T. (eds) Discovery Science. DS 2008. Lecture Notes in Computer Science(), vol 5255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88411-8_20

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-88411-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88410-1

  • Online ISBN: 978-3-540-88411-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics