Neural Model-Based Similarity Prediction for Compounds with Unknown Structures

Borzone, Eugenio; Di Persia, Leandro Ezequiel; Gerard, Matias

doi:10.1007/978-3-031-19647-8_6

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1643)

Included in the following conference series:

International Conference on Applied Informatics

Abstract

Compound similarity analysis is widely used in many areas related to cheminformatics. Its calculation is straightforward when compound structures are known. However, there are no methods to obtain similarity when this information is not available. Here we propose a novel approach to solve this problem. It generates compound representations from metabolic networks and uses a neural network to predict similarity. The results show that the generated embeddings preserve the neighborhood of the original metabolic graph, i.e., compounds participating in the same reactions are close together in the embedding space. Results for compounds with known structures show that the proposal estimates similarity with an error of less than 10%. In addition, a qualitative analysis shows that similarity prediction for compounds with unknown structures gives promising results using the generated embeddings.
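
To make the known-structure case concrete, the sketch below computes a Tanimoto coefficient between two binary fingerprints, a common choice for structure-based similarity. The abstract does not say which similarity measure or fingerprint the authors use, so the measure, the function name `tanimoto`, and the toy bit sets are assumptions for illustration only.

```python
from typing import Set


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is given as the set of indices of its "on" bits
    (a hypothetical representation chosen for this sketch).
    """
    if not fp_a and not fp_b:
        # Convention adopted here: two empty fingerprints score 0.0.
        return 0.0
    shared = len(fp_a & fp_b)
    # |A ∩ B| / |A ∪ B|, with the union size obtained by inclusion-exclusion.
    return shared / (len(fp_a) + len(fp_b) - shared)


# Toy fingerprints, not derived from real compounds.
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8}

print(f"{tanimoto(fp_x, fp_y):.2f}")  # 2 shared bits out of 5 distinct -> 0.40
```

A value like this can only be computed when both structures are available; the chapter addresses the case where it must instead be predicted from embeddings built on the metabolic network.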



Author information

Authors and Affiliations

Research Institute for Signals, Systems and Computational Intelligence (sınc(i)), FICH-UNL/CONICET, Ciudad Universitaria UNL, S3000, Santa Fe, Argentina
Eugenio Borzone, Leandro Ezequiel Di Persia & Matias Gerard

Corresponding author

Correspondence to Eugenio Borzone.

Editor information

Editors and Affiliations

Universidad Distrital Francisco Jose de Caldas, Bogota, Colombia
Hector Florez
Universidad Continental, Arequipa, Peru
Henry Gomez

About this paper

Cite this paper

Borzone, E., Di Persia, L.E., Gerard, M. (2022). Neural Model-Based Similarity Prediction for Compounds with Unknown Structures. In: Florez, H., Gomez, H. (eds) Applied Informatics. ICAI 2022. Communications in Computer and Information Science, vol 1643. Springer, Cham. https://doi.org/10.1007/978-3-031-19647-8_6

DOI: https://doi.org/10.1007/978-3-031-19647-8_6
Published: 19 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19646-1
Online ISBN: 978-3-031-19647-8
eBook Packages: Computer Science, Computer Science (R0)
