Learning theory toward Genome Informatics

  • Invited Papers
  • Conference paper
Algorithmic Learning Theory (ALT 1993)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 744))

Abstract

This paper discusses several problems in molecular biology to which learning paradigms may be applicable. As a case study, we present our recent work on knowledge discovery from amino acid sequences using the PAC-learning paradigm.

Editor information

Klaus P. Jantke Shigenobu Kobayashi Etsuji Tomita Takashi Yokomori

Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Miyano, S. (1993). Learning theory toward Genome Informatics. In: Jantke, K.P., Kobayashi, S., Tomita, E., Yokomori, T. (eds) Algorithmic Learning Theory. ALT 1993. Lecture Notes in Computer Science, vol 744. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57370-4_34

  • DOI: https://doi.org/10.1007/3-540-57370-4_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-57370-8

  • Online ISBN: 978-3-540-48096-9

  • eBook Packages: Springer Book Archive
