A Unified ILP Framework for Genome Median, Halving, and Aliquoting Problems Under DCJ

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10562)

Abstract

One of the key computational problems in comparative genomics is the reconstruction of genomes of ancestral species based on genomes of extant species. Since most dramatic changes in genomic architectures are caused by genome rearrangements, this problem is often posed as minimization of the number of genome rearrangements between extant and ancestral genomes. The basic case of three given genomes is known as the genome median problem. Whole genome duplications (WGDs) represent yet another type of dramatic evolutionary event and inspire the reconstruction of pre-duplicated ancestral genomes, referred to as the genome halving problem. The generalization of WGDs to whole genome multiplication events leads to the genome aliquoting problem.

In the present study, we provide polynomial-size integer linear programming formulations for the aforementioned problems. We further obtain such formulations for the restricted versions of the median and halving problems, which were recently introduced to improve biological relevance.

P. Avdeyev and N. Alexeev contributed equally to this work.

Notes

  1. The strand of a gene is typically encoded by a sign. When the strands are ignored, the genomes are represented as permutations of (unsigned) genes.

  2. Here we view genome P as evolving and P-edges as changing.

  3. Note that V is determined by the genes present in the genomes \(P_1, P_2, \dots , P_q\), and thus V does not depend on the choice of M.

  4. Under the inequality \(|a - b| \le c\), we understand a pair of linear inequalities \(a - b \le c\) and \(b - a \le c\).

  5. In fact, they also define a genome \(X\in D_m(R)\) and a labeling of gene copies of A and X such that \(c(A,X)\) is maximized.

  6. In fact, besides R they also define a genome \(X\in D_m(R)\) and a labeling of gene copies of A and X such that \(c(A,X)+c(R,B)\) is maximized.
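
The convention in Note 4 can be checked with a short sketch. The Python below is only an illustration and is not taken from the paper: the function names `abs_constraint`, `linear_pair`, and `forms_agree` are hypothetical, and the check runs over a small range of integers rather than inside an ILP solver. It compares the nonlinear condition \(|a - b| \le c\) with the pair of linear inequalities that replaces it.

```python
from itertools import product


def abs_constraint(a, b, c):
    """Nonlinear form of the condition: |a - b| <= c."""
    return abs(a - b) <= c


def linear_pair(a, b, c):
    """Linear form from Note 4: a - b <= c and b - a <= c."""
    return (a - b <= c) and (b - a <= c)


def forms_agree(values):
    """Check that both forms give the same answer for every triple (a, b, c)."""
    return all(
        abs_constraint(a, b, c) == linear_pair(a, b, c)
        for a, b, c in product(values, repeat=3)
    )


if __name__ == "__main__":
    # Illustrative range only; any finite set of numbers could be used here.
    print(forms_agree(range(-3, 4)))
```

The two forms agree because \(|a - b|\) equals the larger of \(a - b\) and \(b - a\), so bounding both differences by c bounds the absolute value. This is why such a condition can appear in a linear program as two ordinary constraints.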

Acknowledgements

The work of PA and MAA is supported by the National Science Foundation under the grant No. IIS-1462107. The work of NA and YR is partially supported by the National Science Foundation under the grant No. DMS-1406984.

Author information

Corresponding author

Correspondence to Pavel Avdeyev.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Avdeyev, P., Alexeev, N., Rong, Y., Alekseyev, M.A. (2017). A Unified ILP Framework for Genome Median, Halving, and Aliquoting Problems Under DCJ. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_9

  • DOI: https://doi.org/10.1007/978-3-319-67979-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67978-5

  • Online ISBN: 978-3-319-67979-2

  • eBook Packages: Computer Science (R0)
