Abstract
Partitioning genomes into syntenic blocks has many uses in comparative genomics, such as inferring phylogenies or ancestral gene order. These blocks are usually required to contain enough genes to qualify as syntenic. This leads to the problem of finding a common partition of the genomes in which every block has size at least a certain threshold (usually two). When this is not feasible, one can ask to remove a minimum number of “noisy” genes so that such a partition exists. This is known as the Strip Recovery problem. It is similar to the well-known Minimum Common String Partition problem, but also quite different, since the latter places no restriction on the block sizes.
The algorithmic aspects of Strip Recovery are not well understood, especially in the presence of duplicated genes. In this work, we present several new complexity results. First, we solve an open problem mentioned by Bulteau and Weller in 2019, who asked whether one can decide in polynomial time if a common partition with block sizes at least two can be achieved without deleting any genes. We show that the problem is actually NP-hard for any fixed list of allowed block sizes, unless the block sizes are all multiples of the minimum allowed size. The problem is also hard on fixed alphabets if this list is part of the input. However, the minimum number of required gene deletions can be found in polynomial time if both the allowed block sizes and the alphabet are fixed.
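To make the zero-deletion decision question concrete, the following is a minimal brute-force sketch in Python, not the method of the paper. It assumes that each genome is a single unsigned string, that blocks must match exactly (no reversals), and that the allowed block sizes are given as a set; the function names `partitions`, `block_multiset`, and `has_common_partition` are hypothetical. It enumerates every way of cutting each string into blocks of allowed sizes and checks whether the two genomes share a multiset of blocks, so its running time is exponential, consistent with the hardness results stated above.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Iterator, Tuple


def partitions(s: str, sizes: FrozenSet[int]) -> Iterator[Tuple[str, ...]]:
    """Yield every way to cut s into consecutive blocks whose lengths are in sizes."""
    if not s:
        yield ()
        return
    for k in sorted(sizes):
        if k <= len(s):
            for rest in partitions(s[k:], sizes):
                yield (s[:k],) + rest


def block_multiset(blocks: Iterable[str]) -> frozenset:
    """Hashable representation of a multiset of blocks (block, multiplicity)."""
    return frozenset(Counter(blocks).items())


def has_common_partition(g1: str, g2: str, sizes: Iterable[int]) -> bool:
    """Return True if g1 and g2 can be cut into the same multiset of blocks,
    using only block lengths from sizes and without deleting any gene.
    Assumption: unsigned strings, exact block matches, brute-force search."""
    allowed = frozenset(sizes)
    seen = {block_multiset(p) for p in partitions(g1, allowed)}
    return any(block_multiset(p) in seen for p in partitions(g2, allowed))


if __name__ == "__main__":
    print(has_common_partition("abcd", "cdab", {2}))       # blocks ab, cd
    print(has_common_partition("abcd", "acbd", {2}))       # no common blocks
    print(has_common_partition("abcab", "ababc", {2, 3}))  # blocks ab, abc
```

The sketch only answers the decision question; computing the minimum number of gene deletions would additionally require searching over subsequences of both genomes.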
Notes
- 1. Let us note that in practice, the same block cannot occur more than \(n/\min (F)\) times, where n can be taken as the smaller of the total lengths of \(\mathcal {W}_1\) and \(\mathcal {W}_2\). Hence, the smaller set \(\mathbb {C}(n/\min (F))\) should be considered in practice, but this does not help the complexity analysis.
References
Ramírez Alfonsín, J.L.: The Diophantine Frobenius Problem. OUP, Oxford (2005)
Anselmetti, Y., Berry, V., Chauve, C., Chateau, A., Tannier, E., Bérard, S.: Ancestral gene synteny reconstruction improves extant species scaffolding. BMC Genomics 16(10), 1–13 (2015)
Bourque, G., Pevzner, P.A., Tesler, G.: Reconstructing the genomic architecture of ancestral mammals: lessons from human, mouse, and rat genomes. Genome Res. 14(4), 507–516 (2004)
Bulteau, L., Fertin, G., Jiang, M., Rusu, I.: Tractability and approximability of maximal strip recovery. Theor. Comput. Sci. 440–441, 14–28 (2012). https://doi.org/10.1016/j.tcs.2012.04.034
Moret, B.M.E.: Extending the reach of phylogenetic inference. In: Darling, A., Stoye, J. (eds.) WABI 2013. LNCS, vol. 8126, pp. 1–2. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40453-5_1
Bulteau, L., Fertin, G., Rusu, I.: Maximal strip recovery problem with gaps: hardness and approximation algorithms. J. Discrete Algorithms 19, 1–22 (2013). https://doi.org/10.1016/j.jda.2012.12.006
Bulteau, L., Komusiewicz, C.: Minimum common string partition parameterized by partition size is fixed-parameter tractable. In: Proceedings of 25th ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, pp. 102–121. SIAM (2014)
Bulteau, L., Weller, M.: Parameterized algorithms in bioinformatics: an overview. Algorithms 12(12), 256 (2019). https://doi.org/10.3390/a12120256
Chen, X., Zheng, J., Zheng, F., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biol. Bioinform. 2(4), 302–315 (2005). https://doi.org/10.1109/TCBB.2005.48
Chen, Z., Fu, B., Jiang, M., Zhu, B.: On recovering syntenic blocks from comparative maps. J. Comb. Optim. 18(3), 307–318 (2009). https://doi.org/10.1007/s10878-009-9233-x
Choi, V., Zheng, C., Zhu, Q., Sankoff, D.: Algorithms for the extraction of synteny blocks from comparative maps. In: Giancarlo, R., Hannenhalli, S. (eds.) WABI 2007. LNCS, vol. 4645, pp. 277–288. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74126-8_26
Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. ACM Trans. Algorithms 1(2), 350–366 (2005). https://doi.org/10.1145/1103963.1103971
Damaschke, P.: Minimum common string partition parameterized. In: Crandall, K.A., Lagergren, J. (eds.) WABI 2008. LNCS, vol. 5251, pp. 87–98. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87361-7_8
Delabre, M., et al.: Evolution through segmental duplications and losses: a super-reconciliation approach. Algorithms Mol. Biol. 15, 1–15 (2020)
Drillon, G., Champeimont, R., Oteri, F., Fischer, G., Carbone, A.: Phylogenetic reconstruction based on synteny block and gene adjacencies. Mol. Biol. Evol. 37(9), 2747–2762 (2020)
Eichler, E.E., Sankoff, D.: Structural dynamics of eukaryotic chromosome evolution. Science 301(5634), 793–797 (2003)
Ganczorz, M., Gawrychowski, P., Jez, A., Kociumaka, T.: Edit distance with block operations. In: Proceedings of ESA’2018, LIPIcs, vol. 112, pp. 33:1–33:14. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2018)
Garey, M.R., Johnson, D.S.: Computers and Intractability, vol. 174. Freeman, San Francisco (1979)
Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: hardness and approximations. Eur. J. Comb. 12 (2005)
Goodstadt, L., Ponting, C.P.: Phylogenetic reconstruction of orthology, paralogy, and conserved synteny for dog and human. PLoS Comput. Biol. 2(9), e133 (2006)
Hu, S., Li, W., Wang, J.: An improved kernel for the complementary maximal strip recovery problem. In: Xu, D., Du, D., Du, D. (eds.) COCOON 2015. LNCS, vol. 9198, pp. 601–608. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21398-9_47
Jiang, H., Guo, J., Zhu, D., Zhu, B.: A 2-approximation algorithm for the complementary maximal strip recovery problem. In: Pisanti, N., Pissis, S.P. (eds.) 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, 18–20 June 2019, Pisa, Italy, vol. 128 LIPIcs, pp. 5:1–5:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.CPM.2019.5
Jiang, H., Li, Z., Lin, G., Wang, L., Zhu, B.: Exact and approximation algorithms for the complementary maximal strip recovery problem. J. Comb. Optim. 23(4), 493–506 (2012). https://doi.org/10.1007/s10878-010-9366-y
Jiang, H., Zhu, B.: A linear kernel for the complementary maximal strip recovery problem. J. Comput. Syst. Sci. 80(7), 1350–1358 (2014). https://doi.org/10.1016/j.jcss.2014.03.005
Jiang, H., Zhu, B., Zhu, D., Zhu, H.: Minimum common string partition revisited. J. Comb. Optim. 23(4), 519–527 (2012). https://doi.org/10.1007/s10878-010-9370-2
Jiang, M.: Inapproximability of maximal strip recovery. Theor. Comput. Sci. 412(29), 3759–3774 (2011). https://doi.org/10.1016/j.tcs.2011.04.021
Lafond, M., Semeria, M., Swenson, K.M., Tannier, E., El-Mabrouk, N.: Gene tree correction guided by orthology. BMC Bioinform. 14, 1–9 (2013)
Lafond, M., Zhu, B.: Permutation-constrained common string partitions with applications. In: Lecroq, T., Touzet, H. (eds.) SPIRE 2021. LNCS, vol. 12944, pp. 47–60. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86692-1_5
Lechner, M., et al.: Orthology detection combining clustering and synteny for very large datasets. PLoS ONE 9(8), e105015 (2014)
Lewin, M.: A bound for a solution of a linear diophantine problem. J. Lond. Math. Soc. 2(1), 61–69 (1972)
Li, W., Liu, H., Wang, J., Xiang, L., Yang, Y.: An improved linear kernel for complementary maximal strip recovery: simpler and smaller. Theor. Comput. Sci. 786, 55–66 (2019). https://doi.org/10.1016/j.tcs.2018.04.020
Lin, G., Goebel, R., Li, Z., Wang, L.: An improved approximation algorithm for the complementary maximal strip recovery problem. J. Comput. Syst. Sci. 78(3), 720–730 (2012). https://doi.org/10.1016/j.jcss.2011.10.014
Moore, C., Robson, J.M.: Hard tiling problems with simple tiles. Discrete Comput. Geom. 26(4), 573–590 (2001)
Wang, L., Zhu, B.: On the tractability of maximal strip recovery. J. Comput. Biol. 17(7), 907–914 (2010). https://doi.org/10.1089/cmb.2009.0084
Zheng, C., Zhu, Q., Sankoff, D.: Removing noise and ambiguities from comparative maps in rearrangement analysis. IEEE/ACM Trans. Comput. Biol. Bioinform. 4(4), 515–522 (2007). https://doi.org/10.1145/1322075.1322077
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lafond, M., Liyanage, A., Zhu, B., Zou, P. (2022). The Complexity of Finding Common Partitions of Genomes with Predefined Block Sizes. In: Jin, L., Durand, D. (eds) Comparative Genomics. RECOMB-CG 2022. Lecture Notes in Computer Science, vol 13234. Springer, Cham. https://doi.org/10.1007/978-3-031-06220-9_7
DOI: https://doi.org/10.1007/978-3-031-06220-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06219-3
Online ISBN: 978-3-031-06220-9
eBook Packages: Computer Science, Computer Science (R0)