Abstract
We present a new class of binary words: the prefix normal words. They are defined by the property that for any given length k, no factor of length k has more a’s than the prefix of the same length. These words arise in the context of indexing for jumbled pattern matching (a.k.a. permutation matching or Parikh vector matching), where the aim is to decide whether a string has a factor with a given multiplicity of characters, i.e., with a given Parikh vector. Using prefix normal words, we give the first non-trivial characterization of binary words having the same set of Parikh vectors of factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We discuss further properties and state open problems.
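As an illustration of the definition above, the following is a minimal sketch of a brute-force check, not the authors' algorithm. It assumes words are Python strings over the alphabet {a, b}; the function names `is_prefix_normal` and `has_factor_with_counts` are hypothetical and chosen only for this example. The second function shows the jumbled pattern matching question in its simplest form: does the word contain a factor with a given number of a's and b's?

```python
def is_prefix_normal(word: str, letter: str = "a") -> bool:
    """Return True if no factor of length k contains more occurrences of
    `letter` than the prefix of length k, for every k (quadratic time)."""
    n = len(word)
    # prefix_count[i] = number of occurrences of `letter` in word[:i]
    prefix_count = [0] * (n + 1)
    for i, ch in enumerate(word):
        prefix_count[i + 1] = prefix_count[i] + (ch == letter)

    for k in range(1, n + 1):
        in_prefix = prefix_count[k]
        for start in range(n - k + 1):
            in_factor = prefix_count[start + k] - prefix_count[start]
            if in_factor > in_prefix:
                return False
    return True


def has_factor_with_counts(word: str, num_a: int, num_b: int) -> bool:
    """Return True if `word` (assumed binary over {a, b}) has a factor
    with exactly num_a a's and num_b b's, i.e. with that Parikh vector."""
    if num_a < 0 or num_b < 0:
        return False
    k = num_a + num_b
    n = len(word)
    if k > n:
        return False
    prefix_count = [0] * (n + 1)
    for i, ch in enumerate(word):
        prefix_count[i + 1] = prefix_count[i] + (ch == "a")
    # In a binary word, a window of length k with num_a a's has k - num_a b's.
    return any(
        prefix_count[start + k] - prefix_count[start] == num_a
        for start in range(n - k + 1)
    )


if __name__ == "__main__":
    for w in ["aabab", "abaa", "ba"]:
        print(w, is_prefix_normal(w))
    print(has_factor_with_counts("abaab", 2, 1))
```

For example, `abaa` is not prefix normal because the factor `aa` has two a's while the prefix `ab` of the same length has only one. The scan is quadratic in the word length, which is adequate for experimenting with short words but is not meant as an efficient index.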
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fici, G., Lipták, Z. (2011). On Prefix Normal Words. In: Mauri, G., Leporati, A. (eds) Developments in Language Theory. DLT 2011. Lecture Notes in Computer Science, vol 6795. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22321-1_20
DOI: https://doi.org/10.1007/978-3-642-22321-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22320-4
Online ISBN: 978-3-642-22321-1
eBook Packages: Computer Science (R0)