Multiple Instance Learning Allows MHC Class II Epitope Predictions Across Alleles

  • Conference paper
Algorithms in Bioinformatics (WABI 2008)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5251)

Abstract

The human adaptive immune response relies on the recognition of short peptides through proteins of the major histocompatibility complex (MHC). MHC class II molecules are responsible for the recognition of antigens external to a cell. Understanding their specificity is an important step in the design of peptide-based vaccines. The high degree of polymorphism in MHC class II makes the prediction of peptides that bind (and then usually cause an immune response) a challenging task. Typically, these predictions rely on machine learning methods, which require a sufficient number of data points. Because such data are scarce, reliable prediction models are currently available for only about 7% of all known alleles.

We show how to transform the problem of MHC class II binding peptide prediction into a well-studied machine learning problem called multiple instance learning. For alleles with sufficient data, we show how to build a well-performing predictor using standard kernels for multiple instance learning. Furthermore, we introduce a new method for training a classifier for an allele without requiring binding peptide data for that target allele. Instead, we use binding peptide data from other alleles and similarities between the structures of the MHC class II alleles to guide the learning process. This makes it possible, for the first time, to construct predictors for about two thirds of all known MHC class II alleles. On 14 test alleles, these predictors achieve an average performance of 0.71, measured as area under the ROC curve.
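To illustrate the multiple instance view described above, the following is a minimal sketch, not the authors' implementation. It assumes each peptide is treated as a bag whose instances are all of its contiguous substrings of a fixed length (the default of 9 residues is an assumption, as are all function names), and it compares two bags with a normalized set kernel of the kind described in the multi-instance kernel literature. The instance kernel here is a deliberately simple position-match count, chosen only to keep the example short.

```python
import math


def bag_of_windows(peptide, width=9):
    """Return all contiguous substrings of length `width` (the bag of instances).

    The window length is an illustrative assumption; peptides shorter than
    `width` yield a single-instance bag containing the whole peptide.
    """
    if len(peptide) < width:
        return [peptide]
    return [peptide[i:i + width] for i in range(len(peptide) - width + 1)]


def instance_kernel(a, b):
    """Toy instance kernel: number of aligned positions with identical residues."""
    return sum(1 for x, y in zip(a, b) if x == y)


def set_kernel(bag_a, bag_b):
    """Sum the instance kernel over all pairs of instances from the two bags."""
    return sum(instance_kernel(a, b) for a in bag_a for b in bag_b)


def normalized_set_kernel(pep_a, pep_b, width=9):
    """Set kernel between two peptides, scaled so that k(x, x) == 1."""
    bag_a = bag_of_windows(pep_a, width)
    bag_b = bag_of_windows(pep_b, width)
    norm = math.sqrt(set_kernel(bag_a, bag_a) * set_kernel(bag_b, bag_b))
    return set_kernel(bag_a, bag_b) / norm if norm > 0 else 0.0
```

A kernel matrix built from `normalized_set_kernel` over a set of labeled peptides could then be passed to any support vector machine that accepts precomputed kernels. Only the bag, and not any single window, carries the binder or non-binder label, which is what makes this a multiple instance problem.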

Availability: The methods are integrated into the EpiToolKit framework, for which a web server is available at http://www.epitoolkit.org/mhciimulti



Editor information

Keith A. Crandall, Jens Lagergren

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pfeifer, N., Kohlbacher, O. (2008). Multiple Instance Learning Allows MHC Class II Epitope Predictions Across Alleles. In: Crandall, K.A., Lagergren, J. (eds) Algorithms in Bioinformatics. WABI 2008. Lecture Notes in Computer Science, vol 5251. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87361-7_18

  • DOI: https://doi.org/10.1007/978-3-540-87361-7_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87360-0

  • Online ISBN: 978-3-540-87361-7

  • eBook Packages: Computer Science (R0)
