Abstract
Let S be a string over an alphabet Σ = {σ 1, σ 2, …}. A Parikh-mapping maps a substring S′ of S to a |Σ|-length vector that contains, in location i of the vector, the count of σ i in S′. Parikh matching refers to the problem of finding all substrings of a text T which match to a given input |Σ|-length count vector.
In the streaming model one seeks space-efficient algorithms for problems in which there is one pass over the data. We consider Parikh matching in the streaming model. To make this viable we search for substrings whose Parikh-mappings approximately match the input vector. In this paper we present upper and lower bounds on the problem of approximate Parikh matching in the streaming model.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Amir, A., Apostolico, A., Landau, G.M., Satta, G.: Efficient text fingerprinting via Parikh mapping. J. Discrete Algorithms 1(5-6), 409–421 (2003)
Breslauer, D., Galil, Z.: Real-Time Streaming String-Matching. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 162–172. Springer, Heidelberg (2011)
Burcsi, P., Cicalese, F., Fici, G., Liptak, Z.: On Approximate jumbled pattern matching in strings. Theory Comput. Syst. 50(1), 35–51 (2012)
Datar, M., Gionis, A., Indyk, P., Motwani, R.: Maintaining stream statistics over sliding windows. SIAM J. Comput. 31(6), 1794–1813 (2002)
Didier, G.: Common Intervals of Two Sequences. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 17–24. Springer, Heidelberg (2003)
Didier, G., Schmidt, T., Stoye, J., Tsur, D.: Character sets of strings. J. Discrete Algorithms 5(2), 330–340 (2007)
Ejaz, T.: Abelian pattern matching in strings. Ph.D. Dissertation, Faculty of Informatics, Technical University of Dortmund (2010)
Eres, R., Landau, G.M., Parida, L.: Permutation pattern discovery in biosequences. Journal of Computational Biology 11(6), 1050–1060 (2004)
Fellows, M.R., Fertin, G., Hermelin, D., Vialette, S.: Upper and lower bounds for finding connected motifs in vertex-colored graphs. J. Comput. Syst. Sci. 77(4), 799–811 (2011)
Indyk, P.: Stable distributions, pseudorandom generators, embeddings, and data stream computation. JACM 53(3), 307–323 (2006)
Jokinen, P., Tarhio, J., Ukkonen, E.: A comparison of approximate string matching algorithms. Softw., Pract. Exper. 26(12), 1439–1458 (1996)
Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development 31(2), 249–260 (1987)
Karp, R.M., Miller, R.E., Rosenberg, A.L.: Rapid identification of repeated patterns in strings, trees and arrays. In: Proc. STOC, pp. 125–136 (1972)
Karkkainen, J., Sanders, P., Burkhardt, S.: Linear work suffix array construction. JACM 53(6), 918–936 (2006)
Lacroix, V., Fernandes, C.G., Sagot, M.-F.: Reaction Motifs in Metabolic Networks. In: Casadio, R., Myers, G. (eds.) WABI 2005. LNCS (LNBI), vol. 3692, pp. 178–191. Springer, Heidelberg (2005)
Manber, U., Myers, E.W.: Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22(5), 935–948 (1993)
Motwani, R., Raghavan, P.: Randomized algorithms. Cambridge University Press (1995)
Porat, B., Porat, E.: Exact and approximate pattern matching in the streaming model. In: Proc. FOCS, pp. 315–323 (2009)
Schmidt, T., Stoye, J.: Quadratic Time Algorithms for Finding Common Intervals in Two and More Sequences. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 347–358. Springer, Heidelberg (2004)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, LK., Lewenstein, M., Zhang, Q. (2012). Parikh Matching in the Streaming Model. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds) String Processing and Information Retrieval. SPIRE 2012. Lecture Notes in Computer Science, vol 7608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34109-0_35
Download citation
DOI: https://doi.org/10.1007/978-3-642-34109-0_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34108-3
Online ISBN: 978-3-642-34109-0
eBook Packages: Computer ScienceComputer Science (R0)