The Complexity of Finding Common Partitions of Genomes with Predefined Block Sizes

  • Conference paper
Comparative Genomics (RECOMB-CG 2022)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 13234)

Abstract

Partitioning genomes into syntenic blocks has many uses in comparative genomics, such as inferring phylogenies or ancestral gene order. These blocks are usually required to contain enough genes to qualify as syntenic. This leads to the problem of finding a common partition of the genomes in which the size of every block is at least a certain threshold (usually two). When this is not feasible, one can ask to remove a minimum number of “noisy” genes so that such a partition exists. This is known as the Strip Recovery problem. It is similar to the well-known Minimum Common String Partition problem, but differs in that the latter places no restriction on the block sizes.
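
As a concrete reading of this definition, the following Python sketch decides by brute force whether two small genomes admit a common partition whose block sizes all lie in a given set. It assumes each genome is a single unsigned string with one character per gene and that all allowed sizes are positive integers; the function names are hypothetical. The search is exhaustive and exponential, so it is meant for toy inputs only and is not the method of the paper.

```python
from collections import Counter


def segmentations(s, sizes):
    """Yield every way to cut s into consecutive blocks whose lengths are in sizes."""
    if not s:
        yield []
        return
    for k in sizes:
        if k <= len(s):
            for rest in segmentations(s[k:], sizes):
                yield [s[:k]] + rest


def has_common_partition(s1, s2, sizes):
    """Return True if s1 and s2 can be cut into the same multiset of blocks.

    Assumption: genomes are plain strings and sizes holds positive integers.
    """
    if Counter(s1) != Counter(s2):
        return False  # different gene content, so no common partition exists
    # Multisets of blocks obtainable from s2, stored in a hashable form.
    targets = {frozenset(Counter(p).items()) for p in segmentations(s2, sizes)}
    return any(
        frozenset(Counter(p).items()) in targets
        for p in segmentations(s1, sizes)
    )
```

For instance, `has_common_partition("abcd", "cdab", {2})` returns `True` (blocks `ab` and `cd`), while `has_common_partition("abcd", "acbd", {2})` returns `False` because no block of size two is shared.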

The algorithmic aspects of Strip Recovery are not well understood, especially in the presence of duplicated genes. In this work, we present several new complexity results. First, we solve an open problem mentioned by Bulteau and Weller in 2019, who asked whether one can decide in polynomial time if a common partition with block sizes at least two can be achieved without deleting any genes. We show that the problem is actually NP-hard for any fixed list of allowed block sizes, unless the block sizes are all multiples of the minimum allowed size. The problem is also hard on fixed alphabets if this list is part of the input. However, the minimum number of required gene deletions can be found in polynomial time if both the allowed block sizes and the alphabet are fixed.
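
To make the condition in the hardness statement concrete, the small helper below (a hypothetical name, not taken from the paper) tests whether every allowed block size is a multiple of the smallest one, which is the only case the stated NP-hardness result leaves out. It assumes the list of sizes is non-empty and contains positive integers.

```python
def all_multiples_of_minimum(sizes):
    """Return True if every allowed block size is a multiple of the smallest one.

    Assumption: sizes is a non-empty collection of positive integers.
    """
    smallest = min(sizes)
    return all(k % smallest == 0 for k in sizes)
```

Since any block of size at least two can be cut into pieces of sizes 2 and 3, the "block sizes at least two" question can be viewed as the list {2, 3}; `all_multiples_of_minimum({2, 3})` returns `False`, so that list falls under the hardness result, whereas `all_multiples_of_minimum({2, 4, 6})` returns `True`.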

Notes

  1. Let us note that in practice, the same block cannot occur more than \(n/\min (F)\) times, where \(n\) can be taken to be the smaller of the total lengths of \(\mathcal {W}_1\) and \(\mathcal {W}_2\). Hence, the smaller set \(\mathbb {C}(n/\min (F))\) should be considered in practice, but this does not help the complexity analysis.

References

  1. Ramírez Alfonsín, J.L.: The Diophantine Frobenius Problem. OUP, Oxford (2005)

  2. Anselmetti, Y., Berry, V., Chauve, C., Chateau, A., Tannier, E., Bérard, S.: Ancestral gene synteny reconstruction improves extant species scaffolding. BMC Genomics 16(10), 1–13 (2015)

  3. Bourque, G., Pevzner, P.A., Tesler, G.: Reconstructing the genomic architecture of ancestral mammals: lessons from human, mouse, and rat genomes. Genome Res. 14(4), 507–516 (2004)

  4. Bulteau, L., Fertin, G., Jiang, M., Rusu, I.: Tractability and approximability of maximal strip recovery. Theor. Comput. Sci. 440–441, 14–28 (2012). https://doi.org/10.1016/j.tcs.2012.04.034

  5. Moret, B.M.E.: Extending the reach of phylogenetic inference. In: Darling, A., Stoye, J. (eds.) WABI 2013. LNCS, vol. 8126, pp. 1–2. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40453-5_1

  6. Bulteau, L., Fertin, G., Rusu, I.: Maximal strip recovery problem with gaps: hardness and approximation algorithms. J. Discrete Algorithms 19, 1–22 (2013). https://doi.org/10.1016/j.jda.2012.12.006

  7. Bulteau, L., Komusiewicz, C.: Minimum common string partition parameterized by partition size is fixed-parameter tractable. In: Proceedings of 25th ACM-SIAM Symposium on Discrete Algorithms, SODA2014, pp. 102–121. SIAM (2014)

  8. Bulteau, L., Weller, M.: Parameterized algorithms in bioinformatics: an overview. Algorithms 12(12), 256 (2019). https://doi.org/10.3390/a12120256

  9. Chen, X., Zheng, J., Zheng, F., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biol. Bioinform. 2(4), 302–315 (2005). https://doi.org/10.1109/TCBB.2005.48

  10. Chen, Z., Fu, B., Jiang, M., Zhu, B.: On recovering syntenic blocks from comparative maps. J. Comb. Optim. 18(3), 307–318 (2009). https://doi.org/10.1007/s10878-009-9233-x

  11. Choi, V., Zheng, C., Zhu, Q., Sankoff, D.: Algorithms for the extraction of synteny blocks from comparative maps. In: Giancarlo, R., Hannenhalli, S. (eds.) WABI 2007. LNCS, vol. 4645, pp. 277–288. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74126-8_26

  12. Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. ACM Trans. Algorithms 1(2), 350–366 (2005). https://doi.org/10.1145/1103963.1103971

  13. Damaschke, P.: Minimum common string partition parameterized. In: Crandall, K.A., Lagergren, J. (eds.) WABI 2008. LNCS, vol. 5251, pp. 87–98. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87361-7_8

  14. Delabre, M., et al.: Evolution through segmental duplications and losses: a super-reconciliation approach. Algorithms Mol. Biol. 15, 1–15 (2020)

  15. Drillon, G., Champeimont, R., Oteri, F., Fischer, G., Carbone, A.: Phylogenetic reconstruction based on synteny block and gene adjacencies. Mol. Biol. Evol. 37(9), 2747–2762 (2020)

  16. Eichler, E.E., Sankoff, D.: Structural dynamics of eukaryotic chromosome evolution. Science 301(5634), 793–797 (2003)

  17. Ganczorz, M., Gawrychowski, P., Jez, A., Kociumaka, T.: Edit distance with block operations. In: Proceedings of ESA’2018, LIPIcs, vol. 112, pp. 33:1–33:14. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2018)

  18. Garey, M.R., Johnson, D.S.: Computers and Intractability, vol. 174. Freeman, San Francisco (1979)

  19. Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: hardness and approximations. Electron. J. Comb. 12 (2005)

  20. Goodstadt, L., Ponting, C.P.: Phylogenetic reconstruction of orthology, paralogy, and conserved synteny for dog and human. PLoS Comput. Biol. 2(9), e133 (2006)

  21. Hu, S., Li, W., Wang, J.: An improved kernel for the complementary maximal strip recovery problem. In: Xu, D., Du, D., Du, D. (eds.) COCOON 2015. LNCS, vol. 9198, pp. 601–608. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21398-9_47

  22. Jiang, H., Guo, J., Zhu, D., Zhu, B.: A 2-approximation algorithm for the complementary maximal strip recovery problem. In: Pisanti, N., Pissis, S.P. (eds.) 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, 18–20 June 2019, Pisa, Italy. LIPIcs, vol. 128, pp. 5:1–5:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.CPM.2019.5

  23. Jiang, H., Li, Z., Lin, G., Wang, L., Zhu, B.: Exact and approximation algorithms for the complementary maximal strip recovery problem. J. Comb. Optim. 23(4), 493–506 (2012). https://doi.org/10.1007/s10878-010-9366-y

  24. Jiang, H., Zhu, B.: A linear kernel for the complementary maximal strip recovery problem. J. Comput. Syst. Sci. 80(7), 1350–1358 (2014). https://doi.org/10.1016/j.jcss.2014.03.005

  25. Jiang, H., Zhu, B., Zhu, D., Zhu, H.: Minimum common string partition revisited. J. Comb. Optim. 23(4), 519–527 (2012). https://doi.org/10.1007/s10878-010-9370-2

  26. Jiang, M.: Inapproximability of maximal strip recovery. Theor. Comput. Sci. 412(29), 3759–3774 (2011). https://doi.org/10.1016/j.tcs.2011.04.021

  27. Lafond, M., Semeria, M., Swenson, K.M., Tannier, E., El-Mabrouk, N.: Gene tree correction guided by orthology. BMC Bioinform. 14, 1–9 (2013)

  28. Lafond, M., Zhu, B.: Permutation-constrained common string partitions with applications. In: Lecroq, T., Touzet, H. (eds.) SPIRE 2021. LNCS, vol. 12944, pp. 47–60. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86692-1_5

  29. Lechner, M., et al.: Orthology detection combining clustering and synteny for very large datasets. PLoS ONE 9(8), e105015 (2014)

  30. Lewin, M.: A bound for a solution of a linear diophantine problem. J. Lond. Math. Soc. 2(1), 61–69 (1972)

  31. Li, W., Liu, H., Wang, J., Xiang, L., Yang, Y.: An improved linear kernel for complementary maximal strip recovery: simpler and smaller. Theor. Comput. Sci. 786, 55–66 (2019). https://doi.org/10.1016/j.tcs.2018.04.020

  32. Lin, G., Goebel, R., Li, Z., Wang, L.: An improved approximation algorithm for the complementary maximal strip recovery problem. J. Comput. Syst. Sci. 78(3), 720–730 (2012). https://doi.org/10.1016/j.jcss.2011.10.014

  33. Moore, C., Robson, J.M.: Hard tiling problems with simple tiles. Discrete Comput. Geom. 26(4), 573–590 (2001)

  34. Wang, L., Zhu, B.: On the tractability of maximal strip recovery. J. Comput. Biol. 17(7), 907–914 (2010). https://doi.org/10.1089/cmb.2009.0084

  35. Zheng, C., Zhu, Q., Sankoff, D.: Removing noise and ambiguities from comparative maps in rearrangement analysis. IEEE/ACM Trans. Comput. Biol. Bioinform. 4(4), 515–522 (2007). https://doi.org/10.1145/1322075.1322077

Author information

Corresponding authors

Correspondence to Manuel Lafond or Binhai Zhu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lafond, M., Liyanage, A., Zhu, B., Zou, P. (2022). The Complexity of Finding Common Partitions of Genomes with Predefined Block Sizes. In: Jin, L., Durand, D. (eds) Comparative Genomics. RECOMB-CG 2022. Lecture Notes in Computer Science, vol 13234. Springer, Cham. https://doi.org/10.1007/978-3-031-06220-9_7

  • DOI: https://doi.org/10.1007/978-3-031-06220-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06219-3

  • Online ISBN: 978-3-031-06220-9

  • eBook Packages: Computer Science, Computer Science (R0)