Abstract
In the pattern matching on labeled graphs problem, given an edge-labeled graph \(G = (V, E)\) and a string P, one seeks to determine whether there exists a walk in the graph whose concatenation of edge labels (approximately) matches P. This is an elementary subproblem in using genome graphs to represent collections of genetic sequences, where patterns arise as reads in the sequencing data. Unfortunately, for general graphs, it is known that an algorithm running in \(O(|E||P|^{1-\varepsilon} + |E|^{1-\varepsilon}|P|)\) time for any constant \(\varepsilon > 0\) is not possible under the Strong Exponential Time Hypothesis (SETH). De Bruijn graphs provide a valuable exception, allowing a path exactly matching a pattern to be found in \(O(|E| + |P|)\) time for constant-sized alphabets. This property has led de Bruijn graphs to be applied as indexes in the popular tool vg-toolkit. In this work, we consider the case where the graph is a de Bruijn graph and the pattern includes wildcards (which match any edge label). We demonstrate that adding these wildcards to the pattern is enough to again prove quadratic lower bounds conditioned on SETH for pattern matching on de Bruijn graphs, even when restricted to alphabets of size at most three and k-mer length \(\varTheta(\log |V|)\).
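To make the problem statement concrete, the following is a minimal Python sketch of the straightforward quadratic-time approach to wildcard matching on a de Bruijn graph. It is not the paper's construction. It assumes one common definition of the graph (vertices are the distinct k-mers of the input strings, with an edge from u to v labeled by the last character of v whenever the last k-1 characters of u equal the first k-1 characters of v), and the names `build_de_bruijn`, `has_matching_walk`, and the wildcard symbol `?` are hypothetical choices made for illustration.

```python
from collections import defaultdict

WILDCARD = "?"  # assumed wildcard symbol; matches any edge label


def build_de_bruijn(strings, k):
    """Return an adjacency map: k-mer -> list of (edge_label, successor k-mer).

    Assumed definition: an edge u -> v exists when u[1:] == v[:k-1],
    and it is labeled with the last character of v.
    """
    kmers = {s[i:i + k] for s in strings for i in range(len(s) - k + 1)}
    by_prefix = defaultdict(list)
    for v in kmers:
        by_prefix[v[:k - 1]].append(v)
    return {u: [(v[-1], v) for v in by_prefix.get(u[1:], [])] for u in kmers}


def has_matching_walk(graph, pattern):
    """Decide whether some walk spells `pattern`, treating WILDCARD as any label.

    The frontier holds every vertex at which a walk matching the pattern
    prefix read so far can end. Each pattern character scans the out-edges
    of the frontier once, so the worst case is O(|E| * |P|) time.
    """
    frontier = set(graph)  # a walk may start at any vertex
    for c in pattern:
        frontier = {
            v
            for u in frontier
            for label, v in graph[u]
            if c == WILDCARD or c == label
        }
        if not frontier:
            return False
    return bool(frontier)


if __name__ == "__main__":
    g = build_de_bruijn(["ACGTAC"], k=3)
    print(has_matching_walk(g, "T?C"))
    print(has_matching_walk(g, "TT"))
```

The sketch spends time proportional to the number of edges for every pattern character, which is the quadratic behavior that the lower bound described above indicates cannot be substantially improved under SETH once wildcards are allowed.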
Acknowledgement
S. Thankachan is partially supported by the U.S. National Science Foundation (NSF) award CCF-2316691.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ganguly, A., Gibney, D., Das, A.K., Thankachan, S.V. (2025). On the Hardness of Wildcard Pattern Matching on de Bruijn Graphs. In: Bansal, M.S., et al. Computational Advances in Bio and Medical Sciences. ICCABS 2023. Lecture Notes in Computer Science, vol 14548. Springer, Cham. https://doi.org/10.1007/978-3-031-82768-6_7
DOI: https://doi.org/10.1007/978-3-031-82768-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-82767-9
Online ISBN: 978-3-031-82768-6
eBook Packages: Computer Science, Computer Science (R0)