A Phase Transition for the Distribution of Matching Blocks

Claudia Neuhauser

doi:10.1017/S0963548300001930

A Phase Transition for the Distribution of Matching Blocks

Published online by Cambridge University Press: 12 September 2008

Claudia Neuhauser

Show author details

Claudia Neuhauser: Affiliation:
Department of Mathematics, University of Wisconsin, 480 Lincoln Drive, Madison, WI 53706, USA

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

We show distributional results for the length of the longest matching consecutive subsequence between two independent sequences A1, A2, …, Am and B1, B2, …, Bn whose letters are taken from a finite alphabet. We assume that A1, A2, … are i.i.d. with distribution μ and B1, B2, … are i.i.d. with distribution ν. It is known that if μ and v are not too different, the Chen–Stein method for Poisson approximation can be used to establish distributional results. We extend these results beyond the region where the Chen–Stein method was previously successful. We use a combination of ‘matching by patterns’ results obtained by Arratia and Waterman [1], and the Chen–Stein method to show that the Poisson approximation can be extended. Our method explains how the matching is achieved. This provides an explanation for the formulas in Arratia and Waterman [1] and thus answers one of the questions posed in comment F19 in Aldous [2]. Furthermore, in the case where the alphabet consists of only two letters, the phase transition observed by Arratia and Waterman [1] for the strong law of large numbers extends to the distributional result. We conjecture that this phase transition on the distributional level holds for any finite alphabet.

Type: Research Article
Information: Combinatorics, Probability and Computing , Volume 5 , Issue 2 , June 1996 , pp. 139 - 159

DOI: https://doi.org/10.1017/S0963548300001930 [Opens in a new window]
Copyright: Copyright © Cambridge University Press 1996

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

[1]Arratia, R. and Waterman, M. S. (1985). Critical phenomena in sequence matching. Ann. Probab. 13, 1236–1249.CrossRef Google Scholar

[2]Aldous, D. (1989). Probability approximations via the Poisson clumping heuristic, Applied Mathematical Sciences 77, Springer-Verlag.CrossRef Google Scholar

[3]Arratia, R., Gordon, L. and Waterman, M. S. (1990). The Erdős–Rényi law in distribution, for coin tossing and sequence matching, Ann. Statist. 18, 539–570.CrossRef Google Scholar

[4]Erdős, P. and Rényi, A. (1970). On a new law of large numbers, J. Anal. Math. 22, 103–111.CrossRef Google Scholar

[5]Arratia, R., Goldstein, L. and Gordon, L. (1989). Two moments suffice for Poisson approximations: The Chen–Stein method, Ann. Probab. 17, 9–25.CrossRef Google Scholar

[6]Arratia, R., Gordon, L. and Waterman, M. S. (1986). An extreme value theory for sequence matching, Ann. Statist. 14, 971–993.CrossRef Google Scholar

[7]Karlin, S. and Ost, F. (1987). Counts of long aligned word matches among random letter sequences, Adv. Applic. Probab. 19, 293–351.CrossRef Google Scholar

[8]Karlin, S., Ghandour, G., Ost, F., Tavaré, S. and Korn, L. J. (1983). New approaches for computer analysis of nucleic acid sequences, Proc. Nat. Acad. Sci. USA 80, 5660–5664.CrossRef Google Scholar PubMed

[9]Stein, C. M. (1971). A bound for the error in the normal approximations to the distribution of a sum of dependent random variables, Proc. Sixth Berkeley Symp. Math. Statist. Probab. 3, 583–602.Google Scholar

[10]Chen, L. H. Y. (1975). Poisson approximation for dependent trials, Ann. Probab. 3, 534–545.CrossRef Google Scholar

[11]Dembo, A., Karlin, S. and Zeitouni, O. (1994). Limit distribution of maximal non-aligned two-sequence segmental score. Preprint.CrossRef Google Scholar

[12]Stein, C. M. (1986). Approximate Computation of Expectations, IMS, Hayward, CA.CrossRef Google Scholar

[13]Dembo, A. and Zeitouni, O. (1993). Large Deviations Techniques and Applications, Jones and Bartlett, Boston.Google Scholar

[14]Blom, G. and Thorburn, D. (1982). How many digits are required until given sequences are obtained? J. Appl. Probab. 19, 518–531.CrossRef Google Scholar

[15]Chryssaphinou, O. and Papastavridis, S. (1988). A limit theorem for the number of non-overlapping occurrences of a pattern in a sequence of independent trials, J. Appl. Probab. 25, 428–431.CrossRef Google Scholar

Article contents

A Phase Transition for the Distribution of Matching Blocks

Abstract

Access options

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests