Classifier-Free Graph Diffusion for Molecular Property Targeting

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14944)

Abstract

This work focuses on the task of property targeting: that is, generating molecules conditioned on target chemical properties to expedite candidate screening for novel drug and materials development. DiGress is a recent diffusion model for molecular graphs whose distinctive feature is that it allows property targeting through classifier-based (CB) guidance. While CB guidance may work to generate molecule-like graphs, we suggest that its assumptions apply poorly to the chemical domain. Based on this insight, we propose a classifier-free DiGress (FreeGress), which works by directly injecting the conditioning information into the training process. Classifier-free (CF) guidance is convenient because its assumptions are less stringent and because it does not require training an auxiliary property regressor, thus halving the number of trainable parameters in the model. We empirically show that our model yields a significant improvement in Mean Absolute Error with respect to DiGress on property targeting tasks on the QM9 and ZINC-250k benchmarks. As an additional contribution, we propose a simple yet powerful approach to improve the chemical validity of generated samples, based on the observation that certain chemical properties, such as molecular weight, correlate with the number of atoms in molecules.
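
As an illustration of the general idea behind classifier-free guidance in a discrete setting, the sketch below combines a conditional and an unconditional categorical prediction at sampling time. It is not taken from the paper: the function name `guided_distribution`, the guidance weight `s`, the specific combination rule (unconditional probabilities rescaled by the conditional-to-unconditional ratio raised to `s`), and the example probabilities are all assumptions made for illustration, and the actual FreeGress formulation may differ.

```python
import math


def guided_distribution(p_cond, p_uncond, s):
    """Blend conditional and unconditional categorical predictions.

    Computes p(x) proportional to p_uncond(x) * (p_cond(x) / p_uncond(x)) ** s
    in log space. With s = 0 the result is the unconditional prediction,
    with s = 1 the conditional one, and s > 1 amplifies the conditioning.
    (Illustrative rule; assumed, not taken from the paper.)
    """
    eps = 1e-12  # guards against log(0)
    logits = [
        math.log(pu + eps) + s * (math.log(pc + eps) - math.log(pu + eps))
        for pc, pu in zip(p_cond, p_uncond)
    ]
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    weights = [math.exp(v - m) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical predictions for one node over three atom types (C, N, O).
p_cond = [0.6, 0.3, 0.1]    # denoiser output given the target property
p_uncond = [0.4, 0.4, 0.2]  # denoiser output with the property masked out
guided = guided_distribution(p_cond, p_uncond, s=2.0)
```

In this kind of scheme a single network typically produces both predictions, the unconditional one being obtained by replacing the property with a placeholder during a fraction of training steps, which is why no separate property regressor is needed.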

Notes

  1. Using packages such as RDKit [23] and psi4 [36].

References

  1. Aldeghi, M., Graff, D.E., Frey, N., et al.: Roughness of molecular property landscapes and its impact on modellability. J. Chem. Inf. Model. 62(19), 4660–4671 (2022). https://doi.org/10.1021/acs.jcim.2c00903

  2. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 214–223. PMLR (2017)

  3. Austin, J., Johnson, D.D., Ho, J., Tarlow, D., van den Berg, R.: Structured denoising diffusion models in discrete state-spaces. In: Advances in Neural Information Processing Systems, vol. 34, pp. 17981–17993. Curran Associates, Inc. (2021)

  4. Bacciu, D., Podda, M.: GraphGen-redux: a fast and lightweight recurrent model for labeled graph generation. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2021). https://doi.org/10.1109/IJCNN52387.2021.9533743

  5. Corso, G., Cavalleri, L., Beaini, D., Liò, P., Veličković, P.: Principal neighbourhood aggregation for graph nets. In: Advances in Neural Information Processing Systems, vol. 33, pp. 13260–13271. Curran Associates, Inc. (2020)

  6. Dara, S., Dhamercherla, S., Jadav, S.S., et al.: Machine learning in drug discovery: a review. Artif. Intell. Rev. 55(3), 1947–1999 (2021). https://doi.org/10.1007/s10462-021-10058-4

  7. De Cao, N., Kipf, T.: MolGAN: an implicit generative model for small molecular graphs. In: ICML 2018 workshop on Theoretical Foundations and Applications of Deep Generative Models (2018)

  8. Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. In: Advances in Neural Information Processing Systems, vol. 34, pp. 8780–8794. Curran Associates, Inc. (2021)

  9. Dwivedi, V.P., Bresson, X.: A generalization of transformer networks to graphs. In: AAAI Workshop on Deep Learning on Graphs: Methods and Applications (2021)

  10. Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, vol. 27. Curran Associates, Inc. (2014)

  11. Goyal, N., Jain, H.V., Ranu, S.: GraphGen: a scalable approach to domain-agnostic labeled graph generation. In: Proceedings of The Web Conference 2020. pp. 1253–1263. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3366423.3380201

  12. Gu, S., et al.: Vector quantized diffusion model for text-to-image synthesis. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10686–10696 (2022). https://doi.org/10.1109/CVPR52688.2022.01043

  13. Guimaraes, G.L., Sanchez-Lengeling, B., Outeiral, C., Farias, P.L.C., Aspuru-Guzik, A.: Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. arXiv preprint arXiv:1705.10843 (2018)

  14. Gómez-Bombarelli, R., Wei, J.N., Duvenaud, D., et al.: Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4(2), 268–276 (2018). https://doi.org/10.1021/acscentsci.7b00572

  15. Haefeli, K.K., Martinkus, K., Perraudin, N., Wattenhofer, R.: Diffusion models for graphs benefit from discrete state spaces. In: The First Learning on Graphs Conference (2022)

  16. Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Advances in Neural Information Processing Systems. vol. 33, pp. 6840–6851. Curran Associates, Inc. (2020)

  17. Ho, J., Salimans, T.: Classifier-free diffusion guidance. In: NeurIPS 2021 Workshop DGMs Applications (2022)

  18. Jin, W., Barzilay, R., Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation. In: Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 2323–2332. PMLR (2018)

  19. Jin, W., Barzilay, R., Jaakkola, T.: Hierarchical generation of molecular graphs using structural motifs. In: Proceedings of the 37th International Conference on Machine Learning. ICML2020, JMLR.org (2020)

  20. Johnson, D.D., Austin, J., van den Berg, R., Tarlow, D.: Beyond in-place corruption: insertion and deletion in denoising probabilistic models. In: ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (2021)

  21. Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2022)

  22. Krenn, M., Ai, Q., Barthel, S., et al.: SELFIES and the future of molecular string representations. Patterns 3(10), 100588 (2022). https://doi.org/10.1016/j.patter.2022.100588

  23. Landrum, G.: RDKit: open-source cheminformatics software (2016)

  24. Li, Y., Vinyals, O., Dyer, C., Pascanu, R., Battaglia, P.: Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324 (2018)

  25. Liu, C., et al.: Generative diffusion models on graphs: methods and applications. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 6702–6711. International Joint Conferences on Artificial Intelligence Organization (2023). https://doi.org/10.24963/ijcai.2023/751, survey Track

  26. Liu, Y., Zhao, T., Ju, W., et al.: Materials discovery and design using machine learning. J. Materiomics 3(3), 159–177 (2017). https://doi.org/10.1016/j.jmat.2017.08.002

  27. Perez, E., Strub, F., de Vries, H., Dumoulin, V., Courville, A.: FiLM: visual reasoning with a general conditioning layer. In: Proceedings of the AAAI Conference on Artificial Intelligence 32(1) (2018). https://doi.org/10.1609/aaai.v32i1.11671

  28. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Proceedings of Machine Learning Research, vol. 139, pp. 8748–8763. PMLR (2021)

  29. Ramakrishnan, R., Dral, P.O., Rupp, M., von Lilienfeld, O.A.: Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1(1) (2014). https://doi.org/10.1038/sdata.2014.22

  30. Reddi, S., Kale, S., Kumar, S.: On the convergence of Adam and beyond. In: International Conference on Learning Representations (2018)

  31. Runcie, N.T., Mey, A.S.: SILVR: guided diffusion for molecule generation. J. Chem. Inf. Model. 63(19), 5996–6005 (2023). https://doi.org/10.1021/acs.jcim.3c00667

  32. Saharia, C., et al.: Photorealistic text-to-image diffusion models with deep language understanding. In: Advances in Neural Information Processing Systems, vol. 35, pp. 36479–36494. Curran Associates, Inc. (2022)

  33. Shi, C., Xu, M., Zhu, Z., Zhang, W., Zhang, M., Tang, J.: GraphAF: a flow-based autoregressive model for molecular graph generation. In: International Conference on Learning Representations (2020)

  34. Sousa, T., Correia, J., Pereira, V., Rocha, M.: Generative deep learning for targeted compound design. J. Chem. Inf. Model. 61(11), 5343–5361 (2021). https://doi.org/10.1021/acs.jcim.0c01496

  35. Tang, Z., Gu, S., Bao, J., et al.: Improved vector quantized diffusion models. arXiv preprint arXiv:2205.16007 (2023)

  36. Turney, J.M., Simmonett, A.C., Parrish, R.M., et al.: Psi4: an open-source ab initio electronic structure program. WIREs Comput. Mol. Sci. 2(4), 556–565 (2012). https://doi.org/10.1002/wcms.93

  37. Vignac, C., Krawczuk, I., Siraudin, A., Wang, B., Cevher, V., Frossard, P.: DiGress: discrete denoising diffusion for graph generation. In: The Eleventh International Conference on Learning Representations (2023)

  38. Weininger, D.: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inform. Comput. Sci. 28(1), 31–36 (1988). https://doi.org/10.1021/ci00057a005

  39. You, J., Liu, B., Ying, Z., Pande, V., Leskovec, J.: Graph convolutional policy network for goal-directed molecular graph generation. In: Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018)

  40. You, J., Ying, R., Ren, X., et al.: GraphRNN: generating realistic graphs with deep auto-regressive models. In: Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 5708–5717. PMLR (2018)

Acknowledgements

Research partly funded by PNRR - M4C2 - Investimento 1.3, Partenariato Esteso PE00000013 - "FAIR - Future Artificial Intelligence Research" - Spoke 1 "Human-centered AI", funded by the European Commission under the NextGeneration EU programme.

Author information

Corresponding author

Correspondence to Matteo Ninniri.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 261 KB)

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ninniri, M., Podda, M., Bacciu, D. (2024). Classifier-Free Graph Diffusion for Molecular Property Targeting. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14944. Springer, Cham. https://doi.org/10.1007/978-3-031-70359-1_19

  • DOI: https://doi.org/10.1007/978-3-031-70359-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70358-4

  • Online ISBN: 978-3-031-70359-1

  • eBook Packages: Computer Science, Computer Science (R0)
