On the complexity of deriving score functions from examples for problems in molecular biology

Akutsu, Tatsuya; Yagiura, Mutsunori

doi:10.1007/BFb0055106

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1443)

Included in the following conference series:

International Colloquium on Automata, Languages, and Programming

Abstract

Score functions (potential functions) have been used effectively in many problems in molecular biology. We propose a general method for deriving score functions that are consistent with example data, which yields polynomial-time learning algorithms for several important problems in molecular biology (including sequence alignment). On the other hand, we show that deriving a score function for some problems (multiple alignment and protein threading) is computationally hard. However, we show that approximation algorithms for these optimization problems can also be used for deriving score functions.
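To make "consistent with example data" concrete, here is a minimal sketch, not the authors' algorithm. It assumes a linear score function over three hypothetical alignment features (match, mismatch, and gap counts), and it assumes each example lists the known-good alignment next to a few explicitly enumerated competitors. The names `derive_weights` and `score`, the margin of 1, and the toy counts are all illustrative assumptions. A simple perceptron-style update stands in for whatever solver the paper uses.

```python
def score(w, features):
    """Linear score: weighted sum of feature counts."""
    return sum(wi * fi for wi, fi in zip(w, features))


def derive_weights(examples, margin=1.0, max_epochs=1000):
    """Search for weights under which every good alignment outscores
    each of its listed competitors by at least `margin`.

    examples: list of (good, bads); each vector is a tuple of feature counts.
    Returns the weight list, or None if no consistent weights were found
    within max_epochs (the constraints may be infeasible).
    """
    dim = len(examples[0][0])
    w = [0.0] * dim
    for _ in range(max_epochs):
        updated = False
        for good, bads in examples:
            for bad in bads:
                diff = [g - b for g, b in zip(good, bad)]
                # Violated constraint: move w toward the good alignment.
                if score(w, diff) < margin:
                    w = [wi + di for wi, di in zip(w, diff)]
                    updated = True
        if not updated:
            return w  # every listed constraint holds
    return None


# Toy data (assumed): feature counts are (matches, mismatches, gaps).
examples = [
    ((4, 0, 2), [(2, 3, 0), (3, 2, 0)]),
    ((3, 1, 0), [(1, 1, 4), (2, 2, 0)]),
]
weights = derive_weights(examples)
```

The sketch only enforces consistency against the competitors it is given. Requiring the good alignment to beat every possible alternative is the harder condition the abstract refers to, and for multiple alignment and protein threading the abstract states that deriving such a score function is computationally hard.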

Author information

Authors and Affiliations

Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, 108-8639, Tokyo, Japan
Tatsuya Akutsu
Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Sakyo-ku, 606-8501, Kyoto, Japan
Mutsunori Yagiura

Editor information

Kim G. Larsen, Sven Skyum, Glynn Winskel

About this paper

Cite this paper

Akutsu, T., Yagiura, M. (1998). On the complexity of deriving score functions from examples for problems in molecular biology. In: Larsen, K.G., Skyum, S., Winskel, G. (eds) Automata, Languages and Programming. ICALP 1998. Lecture Notes in Computer Science, vol 1443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0055106

DOI: https://doi.org/10.1007/BFb0055106
Published: 26 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64781-2
Online ISBN: 978-3-540-68681-1
eBook Packages: Springer Book Archive