Abstract.
This paper discusses how to count and generate strings that are ``distinct'' in two senses: p -distinct and b -distinct. Two strings x on alphabet A and x' on alphabet A' are said to be p -distinct iff they represent distinct ``patterns''; that is, iff there exists no one—one mapping from A to A' that transforms x into x' . Thus aab and baa are p -distinct while aab and ddc are p -equivalent. On the other hand, x and x' are said to be b -distinct iff they give rise to distinct border (failure function) arrays: thus aab with border array 010 is b -distinct from aba with border array 001 . The number of p -distinct (resp. b -distinct) strings of length n formed using exactly k different letters is the [k,n] entry in an infinite p' (resp. b' ) array. Column sums p[n] and b[n] in these arrays give the number of distinct strings of length n . We present algorithms to compute, in constant time per string, all p -distinct (resp. b -distinct) strings of length n formed using exactly k letters, and we also show how to compute all elements p'[k,n] and b'[k,n] . These ideas and results have application to the efficient generation of appropriate test data sets for many string algorithms.
Similar content being viewed by others
Author information
Authors and Affiliations
Additional information
Received December 21, 1995; revised April 28, 1997.
Rights and permissions
About this article
Cite this article
Moore, D., Smyth, W. & Miller, D. Counting Distinct Strings . Algorithmica 23, 1–13 (1999). https://doi.org/10.1007/PL00009247
Issue Date:
DOI: https://doi.org/10.1007/PL00009247