Abstract
Computing genomic distances between whole genomes is a fundamental problem in comparative genomics. Recent research has produced several genomic distance definitions: number of breakpoints, number of common intervals, number of conserved intervals, Maximum Adjacency Disruption number (MAD), etc. Unfortunately, in the presence of duplications, most of the associated problems are NP-hard, and several heuristics have therefore been proposed recently. However, while it is relatively easy to compare heuristics with one another, very little is known so far about their absolute accuracy. There is therefore a great need for algorithmic approaches that compute exact solutions for these genomic distances. In this paper, we present a novel generic pseudo-boolean approach for computing the exact genomic distance between two whole genomes in the presence of duplications, with strong emphasis on common intervals under the maximum matching model. Most notably, we give strong evidence that the simple LCS heuristic provides very good results on a well-known public benchmark dataset of γ-Proteobacteria.
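To make one of the measures above concrete, here is a minimal sketch of counting common intervals in the easy case, where neither genome contains duplicated genes. It is not the pseudo-boolean method of the paper, which targets genomes with duplications. The function name `count_common_intervals`, the unsigned gene encoding, and the convention of counting single genes and the whole genome as intervals are all assumptions made for illustration.

```python
def count_common_intervals(g1, g2):
    """Count common intervals of two duplicate-free genomes (illustrative sketch).

    A common interval is a run of consecutive genes in g1 whose gene set also
    occupies consecutive positions in g2. Assumption: singletons and the whole
    genome are counted, and gene signs are ignored.
    """
    if len(set(g1)) != len(g1) or sorted(g1) != sorted(g2):
        raise ValueError("genomes must be duplicate-free and share the same genes")

    # Position of every gene in the second genome.
    pos = {gene: i for i, gene in enumerate(g2)}

    count = 0
    for i in range(len(g1)):
        lo = hi = pos[g1[i]]
        for j in range(i, len(g1)):
            p = pos[g1[j]]
            lo, hi = min(lo, p), max(hi, p)
            # g1[i..j] has j - i + 1 distinct genes; they are contiguous in g2
            # exactly when their positions span a window of the same size.
            if hi - lo == j - i:
                count += 1
    return count


print(count_common_intervals([1, 2, 3, 4], [2, 1, 4, 3]))
```

This brute-force version takes quadratic time and serves only to illustrate the definition. Once duplicated genes are allowed, a matching between gene copies has to be chosen first, which is where the hardness discussed in the abstract arises.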
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Angibaud, S., Fertin, G., Rusu, I., Vialette, S. (2006). How Pseudo-boolean Programming Can Help Genome Rearrangement Distance Computation. In: Bourque, G., El-Mabrouk, N. (eds) Comparative Genomics. RCG 2006. Lecture Notes in Computer Science, vol 4205. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11864127_7
DOI: https://doi.org/10.1007/11864127_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44529-6
Online ISBN: 978-3-540-44530-2
eBook Packages: Computer Science, Computer Science (R0)