Digital data structures and order statistics

  • Conference paper
Algorithms and Data Structures (WADS 1989)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 382)

Abstract

This paper studies, in a probabilistic framework, how words (strings) can overlap and how this overlap relates to the height of digital trees built over the same set of words. A word is defined as a random, possibly infinite, sequence of symbols over a finite alphabet. The key notion is the alignment matrix $\{C_{ij}\}_{i,j=1}^{n}$, where $C_{ij}$ is the length of the longest string that is a prefix of both the $i$-th and the $j$-th word. It is proved that the height of the associated digital tree is simply related to the alignment matrix through some order statistics. In particular, using this observation and proving some inequalities for order statistics, we establish that the height of a digital trie under the independent model (i.e., all words are statistically independent) is asymptotically equal to $2 \log_\alpha n$, where $n$ is the number of words stored in the trie and $\alpha$ is a parameter of the probabilistic model. Some extensions of the basic model to other digital trees, such as b-tries, tries with a random number of keys (Poisson model), and suffix trees (dependent keys!), are also briefly discussed.
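
The link between the alignment matrix and the trie height can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, the words are assumed to be distinct finite strings with no word a prefix of another, and the relation used (height equals one plus the largest off-diagonal entry of the matrix) is one common way to state the connection for a trie that stores one word per leaf. The diagonal entries, which would equal the word lengths, are left at 0 because they play no role here.

```python
from itertools import combinations


def common_prefix_length(u: str, v: str) -> int:
    """Length of the longest string that is a prefix of both u and v."""
    length = 0
    for a, b in zip(u, v):
        if a != b:
            break
        length += 1
    return length


def alignment_matrix(words):
    """Symmetric matrix C with C[i][j] = common prefix length of words i and j.

    Diagonal entries are left at 0 (assumption of this sketch).
    """
    n = len(words)
    C = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        C[i][j] = C[j][i] = common_prefix_length(words[i], words[j])
    return C


def height_from_alignments(words):
    """Trie height as 1 + the maximum off-diagonal alignment."""
    n = len(words)
    if n < 2:
        return 0
    C = alignment_matrix(words)
    return 1 + max(C[i][j] for i, j in combinations(range(n), 2))


def trie_height(words, depth=0):
    """Height found by building the trie recursively.

    Words are split on the symbol at position `depth` until every bucket
    holds a single word; the height is the depth of the deepest leaf.
    Assumes distinct words, none of which is a prefix of another.
    """
    if len(words) <= 1:
        return depth
    buckets = {}
    for w in words:
        buckets.setdefault(w[depth], []).append(w)
    return max(trie_height(b, depth + 1) for b in buckets.values())


words = ["0010", "0111", "0011", "1100"]
print(alignment_matrix(words))
print(height_from_alignments(words), trie_height(words))
```

For these four words the largest alignment is between "0010" and "0011", which share the prefix "001", so both functions report a height of 4. The order-statistics view in the abstract corresponds to treating that maximum as the largest order statistic of the entries $C_{ij}$ with $i \neq j$.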

This research was supported in part by NSF grants NCR-8846388 and CCR-8900305.

Editor information

F. Dehne, J.-R. Sack, N. Santoro

Copyright information

© 1989 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Szpankowski, W. (1989). Digital data structures and order statistics. In: Dehne, F., Sack, J.R., Santoro, N. (eds) Algorithms and Data Structures. WADS 1989. Lecture Notes in Computer Science, vol 382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-51542-9_18

  • DOI: https://doi.org/10.1007/3-540-51542-9_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-51542-5

  • Online ISBN: 978-3-540-48237-6

  • eBook Packages: Springer Book Archive