Learning to Align: A Statistical Approach

Ricci, Elisa; de Bie, Tijl; Cristianini, Nello

doi:10.1007/978-3-540-74825-0_3

Elisa Ricci¹,
Tijl de Bie² &
Nello Cristianini²,³

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4723)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

We present a new machine learning approach to the inverse parametric sequence alignment problem: given as training examples a set of correct pairwise global alignments, find the parameter values that make these alignments optimal. We consider the distribution of the scores of all incorrect alignments, then search for those parameters for which the score of the given alignments is as far as possible from the mean of this distribution, measured in number of standard deviations. This normalized distance is called the ‘Z-score’ in statistics. We show that the Z-score is a function of the parameters and can be computed with efficient dynamic programs similar to the Needleman-Wunsch algorithm. We also show that maximizing the Z-score reduces to a simple quadratic program. Experimental results demonstrate the effectiveness of the proposed approach.
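
As a rough illustration of the Z-score idea in the abstract, the sketch below computes the mean and variance of the scores of all global alignments of two sequences with a Needleman-Wunsch-style dynamic program, then reports how many standard deviations a given alignment lies from that mean. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-feature linear scoring model (matches, mismatches, gap columns), the function names, and the simplification that the distribution covers all alignments, including the given one, rather than only the incorrect ones.

```python
from math import sqrt

# Hypothetical scoring model: score = w . phi, with phi counting
# matches, mismatches and gap columns of an alignment.
MATCH, MISMATCH, GAP = 0, 1, 2
K = 3

def _extend(cell, k):
    """Append one column of feature type k to every alignment in `cell`."""
    n, s1, s2 = cell
    t1 = s1[:]
    t1[k] += n                      # sum of (phi + e_k)
    t2 = [row[:] for row in s2]     # sum of (phi + e_k)(phi + e_k)^T
    for a in range(K):
        t2[a][k] += s1[a]
        t2[k][a] += s1[a]
    t2[k][k] += n
    return n, t1, t2

def _add(cells):
    """Merge the statistics of disjoint sets of alignments."""
    n = sum(c[0] for c in cells)
    s1 = [sum(c[1][a] for c in cells) for a in range(K)]
    s2 = [[sum(c[2][a][b] for c in cells) for b in range(K)] for a in range(K)]
    return n, s1, s2

def alignment_moments(s, t):
    """Return (count, sum of phi, sum of phi phi^T) over all global alignments."""
    table = [[None] * (len(t) + 1) for _ in range(len(s) + 1)]
    table[0][0] = (1, [0] * K, [[0] * K for _ in range(K)])
    for i in range(1, len(s) + 1):
        table[i][0] = _extend(table[i - 1][0], GAP)
    for j in range(1, len(t) + 1):
        table[0][j] = _extend(table[0][j - 1], GAP)
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            diag = MATCH if s[i - 1] == t[j - 1] else MISMATCH
            table[i][j] = _add([
                _extend(table[i - 1][j - 1], diag),  # align s[i-1] with t[j-1]
                _extend(table[i - 1][j], GAP),       # gap in t
                _extend(table[i][j - 1], GAP),       # gap in s
            ])
    return table[len(s)][len(t)]

def features(row_s, row_t):
    """Feature counts of one alignment given as two gapped rows ('-' = gap)."""
    phi = [0] * K
    for a, b in zip(row_s, row_t):
        if a == '-' or b == '-':
            phi[GAP] += 1
        elif a == b:
            phi[MATCH] += 1
        else:
            phi[MISMATCH] += 1
    return phi

def z_score(w, row_s, row_t):
    """Z-score of the given alignment under parameters w (variance must be > 0)."""
    s, t = row_s.replace('-', ''), row_t.replace('-', '')
    n, s1, s2 = alignment_moments(s, t)
    mu = [x / n for x in s1]
    cov = [[s2[a][b] / n - mu[a] * mu[b] for b in range(K)] for a in range(K)]
    phi = features(row_s, row_t)
    score = sum(w[a] * phi[a] for a in range(K))
    mean = sum(w[a] * mu[a] for a in range(K))
    var = sum(w[a] * cov[a][b] * w[b] for a in range(K) for b in range(K))
    return (score - mean) / sqrt(var)
```

For example, `z_score((1.0, -1.0, -1.0), "AC-T", "AGGT")` evaluates the alignment of `ACT` against `AGGT` under the assumed weights. Because the mean is linear and the variance quadratic in `w`, the Z-score is an explicit function of the parameters, which is the property the abstract relies on when it speaks of a quadratic program.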

Author information

Authors and Affiliations

Dept. of Electronic and Information Engineering, University of Perugia, 06125, Perugia, Italy
Elisa Ricci
Dept. of Engineering Mathematics, University of Bristol, Bristol, BS8 1TR, UK
Tijl de Bie & Nello Cristianini
Dept. of Computer Science, University of Bristol, Bristol, BS8 1TR, UK
Nello Cristianini

Editor information

Michael R. Berthold, John Shawe-Taylor, Nada Lavrač

About this paper

Cite this paper

Ricci, E., de Bie, T., Cristianini, N. (2007). Learning to Align: A Statistical Approach. In: Berthold, M.R., Shawe-Taylor, J., Lavrač, N. (eds) Advances in Intelligent Data Analysis VII. IDA 2007. Lecture Notes in Computer Science, vol 4723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74825-0_3

DOI: https://doi.org/10.1007/978-3-540-74825-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74824-3
Online ISBN: 978-3-540-74825-0
eBook Packages: Computer Science, Computer Science (R0)
