Unconstrained generation of synthetic antibody–antigen structures to guide machine learning methodology for antibody specificity prediction

Robert, Philippe A.; Akbar, Rahmad; Frank, Robert; Pavlović, Milena; Widrich, Michael; Snapkov, Igor; Slabodkin, Andrei; Chernigovskaya, Maria; Scheffer, Lonneke; Smorodina, Eva; Rawat, Puneet; Mehta, Brij Bhushan; Vu, Mai Ha; Mathisen, Ingvild Frøberg; Prósz, Aurél; Abram, Krzysztof; Olar, Alex; Miho, Enkelejda; Haug, Dag Trygve Tryslew; Lund-Johansen, Fridtjof; Hochreiter, Sepp; Haff, Ingrid Hobæk; Klambauer, Günter; Sandve, Geir Kjetil; Greiff, Victor

doi:10.1038/s43588-022-00372-4

Resource
Published: 19 December 2022

Nature Computational Science volume 2, pages 845–865 (2022)

A preprint version of the article is available at bioRxiv.

Abstract

Machine learning (ML) is a key technology for accurate prediction of antibody–antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody–antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.

**Fig. 1: Pipeline for the high-throughput generation of synthetic 3D antibody–antigen structure datasets suited for diverse ML formalizations.**

**Fig. 2: The Absolut! dataset reflects granular levels of the biological complexity of antibody–antigen binding.**

**Fig. 3: Classification of binding and non-binding antibody sequences with ML.**

**Fig. 4: Transferability of ML method rankings and impact of negative examples for pose classification.**

**Fig. 5: ML prediction of paratope–epitope pairs involved in antibody–antigen binding.**

Data availability

The Absolut! database is available at https://greifflab.org/Absolut and in the NIRD research data archive¹¹³. Source data for Figs. 2–5 is available with this paper.

Code availability

The Absolut! package is freely available at https://github.com/csi-greifflab/Absolut/ and on Zenodo¹¹⁴.

References

Lu, R.-M. et al. Development of therapeutic antibodies for the treatment of diseases. J. Biomed. Sci. 27, 1 (2020).
Barlow, D. J., Edwards, M. S. & Thornton, J. M. Continuous and discontinuous protein antigenic determinants. Nature 322, 747–748 (1986).
Sivalingam, G. N. & Shepherd, A. J. An analysis of B-cell epitope discontinuity. Mol. Immunol. 51, 304–309 (2012).
Akbar, R., Robert, P. A., Pavlovic, M. & Jeliazkov, J. R. A compact vocabulary of paratope–epitope interactions enables predictability of antibody–antigen binding. Cell Rep. 34, 108856 (2021).
Xu, J. L. & Davis, M. M. Diversity in the CDR3 region of VH is sufficient for most antibody specificities. Immunity 13, 37–45 (2000).
Kunik, V., Ashkenazi, S. & Ofran, Y. Paratome: an online tool for systematic identification of antigen-binding regions in antibodies based on sequence or structure. Nucleic Acids Res. 40, W521–W524 (2012).
Ferdous, S. & Martin, A. C. R. AbDb: antibody structure database-a database of PDB-derived antibody structures. Database 2018, (2018).
Dunbar, J. et al. SAbDab: the structural antibody database. Nucleic Acids Res. 42, D1140–D1146 (2014).
Raybould, M. I. J., Kovaltsuk, A., Marks, C. & Deane, C. M. CoV-AbDab: the coronavirus antibody database. Bioinformatics 37, 734–735 (2020).
Wardemann, H. & Busse, C. E. Novel approaches to analyze immunoglobulin repertoires. Trends Immunol. 38, 471–482 (2017).
Shiakolas, A. R. et al. Efficient discovery of SARS-CoV-2-neutralizing antibodies via B cell receptor sequencing and ligand blocking. Nat. Biotechnol. 40, 1270–1275 (2022).
Laustsen, A. H., Greiff, V., Karatt-Vellatt, A., Muyldermans, S. & Jenkins, T. P. Animal immunization, in vitro display technologies, and machine learning for antibody discovery. Trends Biotechnol. https://doi.org/10.1016/j.tibtech.2021.03.003 (2021).
Kanyavuz, A., Marey-Jarossay, A., Lacroix-Desmazes, S. & Dimitrov, J. D. Breaking the law: unconventional strategies for antibody diversification. Nat. Rev. Immunol. 19, 355–368 (2019).
Hoffecker, I. T., Shaw, A., Sorokina, V., Smyrlaki, I. & Högberg, B. Stochastic modeling of antibody binding predicts programmable migration on antigen patterns. Nat. Comput. Sci. 2, 179–192 (2022).
Gainza, P. et al. Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat. Methods https://doi.org/10.1038/s41592-019-0666-6 (2019).
Pedotti, M., Simonelli, L., Livoti, E. & Varani, L. Computational docking of antibody–antigen complexes, opportunities and pitfalls illustrated by influenza hemagglutinin. Int. J. Mol. Sci. 12, 226 (2011).
Yin, R., Feng, B. Y., Varshney, A. & Pierce, B. G. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci. 31, e4379 (2022).
Raybould, M. I. J., Wong, W. K. & Deane, C. M. Antibody–antigen complex modelling in the era of immunoglobulin repertoire sequencing. Mol. Syst. Des. Eng. 4, 679–688 (2019).
Norman, R. A. et al. Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief. Bioinform. https://doi.org/10.1093/bib/bbz095 (2019).
Brown, A. J. et al. Augmenting adaptive immunity: progress and challenges in the quantitative engineering and analysis of adaptive immune receptor repertoires. Mol. Syst. Des. Eng. 4, 701–736 (2019).
Greiff, V., Yaari, G. & Cowell, L. Mining adaptive immune receptor repertoires for biological and clinical information using machine learning. Curr. Opin. Syst. Biol. https://doi.org/10.1016/j.coisb.2020.10.010 (2020).
Fischman, S. & Ofran, Y. Computational design of antibodies. Curr. Opin. Struct. Biol. 51, 156–162 (2018).
Sormanni, P., Aprile, F. A. & Vendruscolo, M. Third generation antibody discovery methods: in silico rational design. Chem. Soc. Rev. 47, 9137–9157 (2018).
Burton, D. R. What Are the Most Powerful Immunogen Design Vaccine Strategies?: Reverse Vaccinology 2.0 Shows Great Promise. Cold Spring Harb. Perspect. Biol. 9, a030262 (2017).
Daberdaku, S. & Ferrari, C. Antibody interface prediction with 3D Zernike descriptors and SVM. Bioinformatics 35, 1870–1876 (2019).
Liberis, E., Velickovic, P., Sormanni, P., Vendruscolo, M. & Liò, P. Parapred: antibody paratope prediction using convolutional and recurrent neural networks. Bioinformatics 34, 2944–2950 (2018).
Eguchi, R. R., Anand, N., Choe, C. A. & Huang, P.-S. IG-VAE: generative modeling of immunoglobulin proteins by direct 3D coordinate generation. Preprint at bioRxiv https://doi.org/10.1101/2020.08.07.242347 (2020).
Jespersen, M. C., Mahajan, S., Peters, B., Nielsen, M. & Marcatili, P. Antibody specific B-cell epitope predictions: leveraging information from antibody–antigen protein complexes. Front. Immunol. 10, 298 (2019).
Liu, G. et al. Antibody complementarity determining region design using high-capacity machine learning. Bioinformatics 36, 2126–2133 (2020).
Marks, C. & Deane, C. M. How repertoire data is changing antibody science. J. Biol. Chem. https://doi.org/jbc.REV120.010181 (2020).
Friedensohn, S. et al. Convergent selection in antibody repertoires is revealed by deep learning. Preprint at bioRxiv https://doi.org/10.1101/2020.02.25.965673 (2020).
Ripoll, D. R., Chaudhury, S. & Wallqvist, A. Using the antibody–antigen binding interface to train image-based deep neural networks for antibody-epitope classification. PLoS Comput. Biol. 17, e1008864 (2021).
Ruffolo, J. A., Sulam, J. & Gray, J. J. Antibody structure prediction using interpretable deep learning. Patterns 3, 100406 (2022).
Del Vecchio, A., Deac, A., Liò, P. & Velickovic, P. Neural message passing for joint paratope–epitope prediction. Preprint at https://arxiv.org/abs/2106.00757 (2021).
Deac, A., Velickovic, P. & Sormanni, P. Attentive cross-modal paratope prediction. J. Comput. Biol. 26, 536–545 (2019).
Mason, D. M. et al. Optimization of therapeutic antibodies by predicting antigen specificity from antibody sequence via deep learning. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-021-00699-9 (2021).
Sela-Culang, I., Ofran, Y. & Peters, B. Antibody specific epitope prediction—emergence of a new paradigm. Curr. Opin. Virol. 11, 98–102 (2015).
Nimrod, G. et al. Computational design of epitope-specific functional antibodies. Cell Rep. 25, 2121–2131.e5 (2018).
Xu, J. Distance-based protein folding powered by deep learning. Proc. Natl Acad. Sci. USA 116, 16856–16865 (2019).
AlQuraishi, M. End-to-end differentiable learning of protein structure. Cell Syst. 8, 292–301.e3 (2019).
Sverrisson, F., Feydy, J., Correia, B. & Bronstein, M. Fast end-to-end learning on protein surfaces. Preprint at bioRxiv https://doi.org/10.1101/2020.12.28.424589 (2020).
Narayanan, H. et al. Machine learning for biologics: opportunities for protein engineering, developability, and formulation. Trends Pharmacol. Sci. https://doi.org/10.1016/j.tips.2020.12.004 (2021).
Townshend, R. J. L., Bedi, R., Suriana, P. A. & Dror, R. O. End-to-end learning on 3D protein structure for interface prediction. Preprint at https://arxiv.org/abs/1807.01297 (2018).
Olimpieri, P. P., Chailyan, A., Tramontano, A. & Marcatili, P. Prediction of site-specific interactions in antibody–antigen complexes: the proABC method and server. Bioinformatics 29, 2285–2291 (2013).
Pittala, S. & Bailey-Kellogg, C. Learning context-aware structural representations to predict antigen and antibody binding interfaces. Bioinformatics 36, 3996–4003 (2020).
Lu, S., Li, Y., Wang, F., Nan, X. & Zhang, S. Leveraging sequential and spatial neighbors information by using CNNs linked with GCNs for paratope prediction. IEEE/ACM Trans. Comput. Biol. Bioinform. 19, 68–74 (2021).
Honda, S., Koyama, K. & Kotaro, K. Cross attentive antibody-antigen interaction prediction with multi-task learning. In 2021 ICML Workshop on Computational Biology.
Swindells, M. B. et al. abYsis: integrated antibody sequence and structure-management, analysis, and prediction. J. Mol. Biol. 429, 356–364 (2017).
Rangel, M. A. et al. Fragment-based computational design of antibodies targeting structured epitopes. Preprint at bioRxiv https://doi.org/10.1101/2021.03.02.433360 (2021).
Kang, Y., Leng, D., Guo, J. & Pan, L. Sequence-based deep learning antibody design for in silico antibody affinity maturation. Preprint at https://arxiv.org/abs/2103.03724 (2021).
Akbar, R. et al. Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies. MAbs 14, 2008790 (2022).
Prakash, E., Shrikumar, A. & Kundaje, A. Towards more realistic simulated datasets for benchmarking deep learning models in regulatory genomics. Preprint at bioRxiv https://doi.org/10.1101/2021.12.26.474224 (2021).
Cao, Y., Yang, P. & Yang, J. Y. H. A benchmark study of simulation methods for single-cell RNA sequencing data. Nat. Commun. 12, 6911 (2021).
Schuler, A., Jung, K., Tibshirani, R., Hastie, T. & Shah, N. Synth-validation: selecting the best causal inference method for a given dataset. Preprint at https://arxiv.org/abs/1711.00083 (2017).
Sandve, G. K. & Greiff, V. Access to ground truth at unconstrained size makes simulated data as indispensable as experimental data for bioinformatics methods development and benchmarking. Bioinformatics btac612 (2022).
Lavin, A. et al. Simulation intelligence: towards a new generation of scientific methods. Preprint at https://arxiv.org/abs/2112.03235 (2021).
Chen, V. et al. Best practices for interpretable machine learning in computational biology. Preprint at bioRxiv https://doi.org/10.1101/2022.10.28.513978 (2022).
Robert, P. A. & Meyer-Hermann, M. Ymir: a 3D structural affinity model for multi-epitope in silico germinal center simulations. iScience 24, 102979 (2021).
Mann, M., Saunders, R., Smith, C., Backofen, R. & Deane, C. M. Producing high-accuracy lattice models from protein atomic coordinates including side chains. Adv. Bioinformatics 2012, 148045 (2012).
Robinson, S. A. et al. Epitope profiling of coronavirus-binding antibodies using computational structural modelling. PLoS Comput. Biol. 17, e1009675 (2021).
Behrens, A-J. et al. Composition and antigenic effects of individual glycan sites of a trimeric HIV-1 envelope glycoprotein. Cell Rep. 14, 2695–2706 (2016).
Miyazawa, S. & Jernigan, R. L. An empirical energy potential with a reference state for protein fold and sequence recognition. Proteins 36, 357–369 (1999).
Ambrosetti, F., Jiménez-García, B., Roel-Touris, J. & Bonvin, A. M. J. Modeling antibody–antigen complexes by information-driven docking. Structure 28, 119–129.e2 (2020).
Greiff, V. et al. Systems analysis reveals high genetic and antigen-driven predetermination of antibody repertoires throughout B cell development. Cell Rep. 19, 1467–1478 (2017).
DeWitt, W. S. et al. A public database of memory and naive B-cell receptor sequences. PLoS ONE 11, e0160853 (2016).
Pires, D. E. & Ascher, D. B. mCSM-AB: a web server for predicting antibody–antigen affinity changes upon mutation with graph-based signatures. Nucleic Acids Res. 44, W469–W473 (2016).
Ju, F. et al. CopulaNet: learning residue co-evolution directly from multiple sequence alignment for protein structure prediction. Preprint at bioRxiv https://doi.org/10.1101/2020.10.06.327585 (2020).
Nogal, B. et al. Mapping polyclonal antibody responses in non-human primates vaccinated with HIV env trimer subunit vaccines. Cell Rep. 30, 3755–3765.e7 (2020).
Adams, R. M., Kinney, J. B., Walczak, A. M. & Mora, T. Epistasis in a fitness landscape defined by antibody–antigen binding free energy. Cell Syst. 8, 86–93.e3 (2019).
Hawkins-Hooker, A. et al. Generating functional protein variants with variational autoencoders. PLoS Comput. Biol. 17, e1008736 (2021).
Angeletti, D. et al. Defining B cell immunodominance to viruses. Nat. Immunol. 18, 456–463 (2017).
Angeletti, D. & Yewdell, J. W. Understanding and manipulating viral immunity: antibody immunodominance enters center stage. Trends Immunol. 39, 549–561 (2018).
Kanduri, C. et al. Profiling the baseline performance and limits of machine learning models for adaptive immune receptor repertoire classification. Preprint at bioRxiv https://doi.org/10.1101/2021.05.23.445346 (2021).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. Preprint at https://arxiv.org/abs/1703.01365 (2017).
Schneider, C., Buchanan, A., Taddese, B. & Deane, C. M. DLAB: deep learning methods for structure-based virtual screening of antibodies. Bioinformatics 38, 377–383 (2021).
Ragoza, M., Hochuli, J., Idrobo, E., Sunseri, J. & Koes, D. R. Protein-ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57, 942–957 (2017).
Leem, J., Dunbar, J., Georges, G., Shi, J. & Deane, C. M. ABodyBuilder: automated antibody structure prediction with data-driven accuracy estimation. MAbs 8, 1259–1268 (2016).
Schneider, C. Deep Learning Algorithms for Predicting Association between Antibody Sequence, Structure, and Antibody Properties (Univ. Oxford, 2022).
Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at https://arxiv.org/abs/1409.0473 (2014).
Vaswani, A. et al. Attention is all you need. Preprint at https://arxiv.org/abs/1706.03762 (2017).
Springer, I., Besser, H., Tickotsky-Moskovitz, N., Dvorkin, S. & Louzoun, Y. Prediction of specific TCR–peptide binding from large dictionaries of TCR–peptide pairs. Front. Immunol. 11, 1803 (2020).
Moris, P. et al. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification. Brief. Bioinform. 22, bbaa318 (2021).
Khan, A. et al. AntBO: Towards real-world automated antibody design with combinatorial Bayesian optimisation. Preprint at https://arxiv.org/abs/2201.12570 (2022).
Akbar, R. et al. In silico proof of principle of machine learning-based antibody design at unconstrained scale. MAbs 14, 2031482 (2022).
Robert, P. A., Marschall, A. L. & Meyer-Hermann, M. Induction of broadly neutralizing antibodies in germinal centre simulations. Curr. Opin. Biotechnol. 51, 137–145 (2018).
Shaw, A. et al. Binding to nanopatterned antigens is dominated by the spatial tolerance of antibodies. Nat. Nanotechnol. 14, 184–190 (2019).
Yaari, G. et al. Models of somatic hypermutation targeting and substitution based on synonymous mutations from high-throughput immunoglobulin sequencing data. Front. Immunol. 4, 358 (2013).
Cassioli, A. et al. An algorithm to enumerate all possible protein conformations verifying a set of distance constraints. BMC Bioinform. 16, 23 (2015).
Hollingsworth, S. A., Lewis, M. C., Berkholz, D. S., Wong, W.-K. & Karplus, P. A. (φ,ψ)₂ motifs: a purely conformation-based fine-grained enumeration of protein parts at the two-residue level. J. Mol. Biol. 416, 78–93 (2012).
Lees, W. D., Stejskal, L., Moss, D. S. & Shepherd, A. J. Investigating substitutions in antibody–antigen complexes using molecular dynamics: a case study with broad-spectrum, influenza A antibodies. Front. Immunol. 8, 143 (2017).
Rodrigues, J. P. G. L., Teixeira, J. M. C., Trellet, M. & Alexandre, M. J. pdb-tools: a Swiss army knife for molecular structures. F1000Res. 7, 1961 (2018).
Boyoglu-Barnum, S. et al. Glycan repositioning of influenza hemagglutinin stem facilitates the elicitation of protective cross-group antibody responses. Nat. Commun. 11, 791 (2020).
Ward, A. B. & Wilson, I. A. The HIV-1 envelope glycoprotein structure: nailing down a moving target. Immunol. Rev. 275, 21–32 (2017).
Andrabi, R. et al. Glycans function as anchors for antibodies and help drive HIV broadly neutralizing antibody development. Immunity 47, 524 (2017).
Mosca, R., Céol, A., Stein, A., Olivella, R. & Aloy, P. 3did: a catalog of domain-based interactions of known three-dimensional structure. Nucleic Acids Res. 42, D374–D379 (2014).
Karp, R. M. Reducibility among combinatorial problems. In Complexity of Computer Computations 85–103 (1972).
The PyMOL Molecular Graphics System, Version 1.8 (Schrödinger) (2015); http://www.sciepub.com/reference/159710
Luong, M.-T., Pham, H. & Manning, C. D. Effective approaches to attention-based neural machine translation. Preprint at https://arxiv.org/abs/1508.04025 (2015).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In Proc. 12th USENIX Conference on Operating Systems Design and Implementation (OSDI'16) 265–283 (2016).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Sokolova, M. & Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process Manag. 45, 427–437 (2009).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. Preprint at https://dl.acm.org/doi/10.5555/3454287.3455008 (2019).
Kingma, D. P. & Welling, M. An Introduction to variational autoencoders. Found. Trends Mach. Learn. (2019).
Higgins, I. et al. beta-VAE: learning basic visual concepts with a constrained variational framework. International Conference on Learning Representations (2016).
Dupont, E. Learning disentangled joint continuous and discrete representations. Adv. Neural Inf. Process. Syst. 31, (2018).
Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
Katanforoush, A. & Shahshahani, M. Distributing points on the sphere, I. Exp. Math. 12, 199–209 (2003).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2009).
Waskom, M. seaborn: statistical data visualization. J. Open Source Softw. 6, 3021 (2021).
Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
Wagih, O. ggseqlogo: a versatile R package for drawing sequence logos. Bioinformatics 33, 3645–3647 (2017).
Robert, P. A., Akbar, R. & Greiff, V. Absolut! in silico antibody–antigen binding database. Nird Research Data Archive https://doi.org/10.11582/2021.00063 (2021).
Robert, P. A., Akbar, R. & Greiff, V. csi-greifflab/Absolut: v2.0 Zenodo https://doi.org/10.5281/zenodo.7415772 (2022).

Acknowledgements

We acknowledge generous support by The Leona M. and Harry B. Helmsley Charitable Trust (#2019PG-T1D011, to V.G.), UiO World-Leading Research Community (to V.G.), UiO:LifeScience Convergence Environment Immunolingo (to V.G., G.K.S. and I.H.H.), EU Horizon 2020 iReceptorplus (#825821) (to V.G.), a Research Council of Norway FRIPRO project (#300740, to V.G.), a Research Council of Norway IKTPLUSS project (#311341, to V.G. and G.K.S.), a Norwegian Cancer Society Grant (#215817, to V.G.), and Stiftelsen Kristian Gerhard Jebsen (K.G. Jebsen Coeliac Disease Research Centre) (to L.S. and G.K.S.). This work was not funded by Marie Skłodowska-Curie Actions while grant writing was supported by the German Arbeitsamt. This work was carried out on Immunohub e-Infrastructure funded by University of Oslo and jointly operated by GreiffLab and SandveLab (the authors) in close collaboration with the University Center for Information Technology, University of Oslo, IT-Department (USIT). We acknowledge T. Malliavin (Institut Pasteur, Paris, France) for comments and suggestions that helped in the analysis of the results, and C. Schneider for helping us reproduce the DLAB-VS pipeline.

Author information

These authors contributed equally: Philippe A. Robert, Rahmad Akbar.

Authors and Affiliations

Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
Philippe A. Robert, Rahmad Akbar, Robert Frank, Igor Snapkov, Andrei Slabodkin, Maria Chernigovskaya, Eva Smorodina, Puneet Rawat, Brij Bhushan Mehta, Ingvild Frøberg Mathisen, Fridtjof Lund-Johansen & Victor Greiff
Department of Informatics, University of Oslo, Oslo, Norway
Milena Pavlović, Lonneke Scheffer & Geir Kjetil Sandve
ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Michael Widrich, Sepp Hochreiter & Günter Klambauer
Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
Mai Ha Vu & Dag Trygve Tryslew Haug
Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
Aurél Prósz
The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
Krzysztof Abram
Department of Complex Systems in Physics, Eötvös Loránd University, Budapest, Hungary
Alex Olar
Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
Enkelejda Miho
aiNET GmbH, Basel, Switzerland
Enkelejda Miho
Swiss Institute of Bioinformatics, Lausanne, Switzerland
Enkelejda Miho
Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
Sepp Hochreiter
Department of Mathematics, University of Oslo, Oslo, Norway
Ingrid Hobæk Haff

Contributions

Study conception: P.A.R., V.G.; study design: P.A.R., R.A., E.M., D.T.T.H., F.L.-J., S.H., I.H.H., G.K., G.K.S., V.G.; study implementation: P.A.R., R.A., R.F., M.P., M.W., I.S., A.P., K.A., A.O., A.S., M.C., L.S., I.F.M.; contributed data and analysis tools: E.S., P.R., B.B.M., M.H.V.; performed the analysis: P.A.R., R.A., R.F., I.F.M., K.A., A.O., A.S.; wrote the paper: P.A.R., R.A., R.F., M.P., M.W., I.S., A.S., M.C., L.S., E.S., P.R., B.B.M., M.H.V., I.F.M., G.K.S., V.G.

Corresponding authors

Correspondence to Philippe A. Robert or Victor Greiff.

Ethics declarations

Competing interests

E.M. declares holding shares in aiNET GmbH. V.G. declares advisory board positions in aiNET GmbH, Enpicom B.V, Specifica Inc, Adaptyv Biosystems, EVQLV, Omniscope, Diagonal Therapeutics, and Absci. V.G. is a consultant for Roche/Genentech, immunai, and Proteinea. The other authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Charlotte Deane, Pieter Meysman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Fernando Chirigati, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Results, Discussion, Figs. 1–24, Algorithms, Tables 1–6 and References.

Reporting Summary

Peer Review File

Source data

Source Data Fig. 2

One tab-separated text file per plot.

Source Data Fig. 3

One tab-separated text file per plot.

Source Data Fig. 4

One tab-separated text file per plot, plus scripts in R.

Source Data Fig. 5

One tab-separated text file per plot.
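
The source data files listed above are described as tab-separated text, one file per plot. As a minimal sketch of how such a file might be read with Python's standard library, assuming each file starts with a header row: the column names `x` and `y` and the in-memory example content are hypothetical stand-ins, since the actual layout of each file is not described here and should be checked per file.

```python
import csv
import io
from typing import Dict, List


def read_tsv(handle) -> List[Dict[str, str]]:
    """Parse tab-separated text with a header row into a list of row dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


def load_tsv(path: str) -> List[Dict[str, str]]:
    """Open a tab-separated file from disk and parse it."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as handle:
        return read_tsv(handle)


# Hypothetical content standing in for one per-plot file; the real
# column names are not given in the text and must be checked per file.
example = io.StringIO("x\ty\n1\t0.5\n2\t0.75\n")
rows = read_tsv(example)

# All values arrive as strings; convert numeric columns explicitly.
ys = [float(row["y"]) for row in rows]
```

Values are kept as strings on load so that no column is silently coerced; numeric conversion is left to the caller once the meaning of each column is known.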

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Robert, P.A., Akbar, R., Frank, R. et al. Unconstrained generation of synthetic antibody–antigen structures to guide machine learning methodology for antibody specificity prediction. Nat Comput Sci 2, 845–865 (2022). https://doi.org/10.1038/s43588-022-00372-4

Received: 16 July 2021
Accepted: 09 November 2022
Published: 19 December 2022
Issue Date: December 2022
DOI: https://doi.org/10.1038/s43588-022-00372-4
