Suffix Trees and Arrays

  • Reference work entry
Encyclopedia of Algorithms

Years and Authors of Summarized Original Work

  • 1973; Weiner

  • 1976; McCreight

  • 1993; Manber, Myers

  • 1995; Ukkonen

The suffix tree is one of the oldest full-text inverted indexes and one of the most persistently studied structures in the theory of algorithms. With extensions and refinements, including succinct and compressed variants that provide some of its expressive power in smaller space, it is a fundamental conceptual tool in the design of string algorithms. Its companion structure, the suffix array, is as powerful as the suffix tree in many applications but requires significantly less space. The uses of these data structures are too numerous to list exhaustively, and new ones are still being discovered. Salient applications include searching for a pattern in a text in time proportional to the size of the pattern, various computations on regularities such as repeats and palindromes within a text, statistical tables of substring occurrences,...
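
To make the pattern-search application concrete, the following is a minimal Python sketch of a suffix array used as a full-text index. It is an illustration under simplifying assumptions, not the construction described in the entry: the array is built by naive sorting of suffixes rather than by a linear-time algorithm, and the search uses plain binary search, which costs O(m log n) character comparisons for a pattern of length m in a text of length n instead of the O(m) bound achievable with a suffix tree. The function names `build_suffix_array` and `find_occurrences` are hypothetical and chosen only for this example.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction for illustration only: sorting materialized suffixes is
    far slower than the linear-time constructions known in the literature.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of `pattern` in `text`, in increasing order.

    Suffixes that begin with `pattern` occupy one contiguous block of the
    suffix array, so two binary searches locate the block's boundaries.
    """
    m = len(pattern)

    # Lower boundary: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper boundary: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)                                 # [5, 3, 1, 0, 4, 2]
    print(find_occurrences(text, sa, "ana"))  # [1, 3]
```

The index is built once and reused for any number of queries, which is what distinguishes this approach from scanning the text for each pattern. The size of the matching block also gives the number of occurrences directly, the kind of information a table of substring statistics relies on.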

Recommended Reading

  1. Abouelhoda MI, Kurtz S, Ohlebusch E (2004) Replacing suffix trees with enhanced suffix arrays. J Discret Algorithms 2(1):53–86

  2. Apostolico A (1985) The myriad virtues of subword trees. In: Apostolico A, Galil Z (eds) Combinatorial algorithms on words. Springer, Berlin/New York, pp 85–96

  3. Apostolico A, Bejerano G (2000) Optimal amnesic probabilistic automata or how to learn and classify proteins in linear time and space. J Comput Biol 7(3–4):381–393

  4. Apostolico A, Preparata FP (1983) Optimal off-line detection of repetitions in a string. Theor Comput Sci 22(3):297–315

  5. Apostolico A, Bock ME, Lonardi S, Xu X (2000) Efficient detection of unusual words. J Comput Biol 7(1–2):71–94

  6. Apostolico A, Denas O et al (2008) Fast algorithms for computing sequence distances by exhaustive substring composition. Algorithms Mol Biol 3(13)

  7. Beller T, Berger K, Ohlebusch E (2012) Space-efficient computation of maximal and supermaximal repeats in genome sequences. In: 19th international symposium on string processing and information retrieval (SPIRE 2012), Cartagena de Indias. Lecture notes in computer science, vol 7608. Springer, pp 99–110

  8. Chi L, Hui K (1992) Color set size problem with applications to string matching. In: Combinatorial pattern matching, Tucson. Springer, pp 230–243

  9. Crochemore M, Hancart C, Lecroq T (2007) Algorithms on strings. Cambridge University Press, New York

  10. Farach M (1997) Optimal suffix tree construction with large alphabets. In: Proceedings of the 38th annual symposium on foundations of computer science, 1997, Miami Beach. IEEE, pp 137–143

  11. Farach M, Noordewier M, Savari S, Shepp L, Wyner A, Ziv J (1995) On the entropy of DNA: algorithms and measurements based on memory and rapid convergence. In: Proceedings of the sixth annual ACM-SIAM symposium on discrete algorithms (SODA ’95), San Francisco. Society for Industrial and Applied Mathematics, pp 48–57

  12. Ferragina P (1997) Dynamic text indexing under string updates. J Algorithms 22(2):296–328

  13. Fiala ER, Greene DH (1989) Data compression with finite windows. Commun ACM 32(4):490–505. doi:10.1145/63334.63341, http://doi.acm.org/10.1145/63334.63341

  14. Gusfield D (1997) Algorithms on strings, trees, and sequences: computer science and computational biology. Cambridge University Press, Cambridge/New York

  15. Gusfield D, Stoye J (2004) Linear time algorithms for finding and representing all the tandem repeats in a string. J Comput Syst Sci 69(4):525–546. doi:10.1016/j.jcss.2004.03.004, http://dx.doi.org/10.1016/j.jcss.2004.03.004

  16. Herold J, Kurtz S, Giegerich R (2008) Efficient computation of absent words in genomic sequences. BMC Bioinform 9(1):167

  17. Kärkkäinen J, Sanders P, Burkhardt S (2006) Linear work suffix array construction. J ACM 53(6):918–936

  18. Kasai T, Lee G, Arimura H, Arikawa S, Park K (2001) Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Combinatorial pattern matching, Jerusalem. Springer, pp 181–192

  19. Kim DK, Sim JS, Park H, Park K (2005) Constructing suffix arrays in linear time. J Discret Algorithms 3(2):126–142

  20. Ko P, Aluru S (2003) Space efficient linear time construction of suffix arrays. In: Combinatorial pattern matching, Morelia. Springer, pp 200–210

  21. Kurtz S (1999) Reducing the space requirement of suffix trees. Softw Pract Exp 29:1149–1171

  22. Larsson NJ (1996) Extended application of suffix trees to data compression. In: Data compression conference, Snowbird, pp 190–199

  23. Lempel A, Ziv J (1976) On the complexity of finite sequences. IEEE Trans Inf Theory 22:75–81

  24. Manber U, Myers G (1993) Suffix arrays: a new method for on-line string searches. SIAM J Comput 22(5):935–948

  25. McCreight EM (1976) A space-economical suffix tree construction algorithm. J ACM 23(2):262–272

  26. Muthukrishnan S (2002) Efficient algorithms for document retrieval problems. In: Proceedings of the thirteenth annual ACM-SIAM symposium on discrete algorithms (SODA ’02), San Francisco. Society for Industrial and Applied Mathematics, Philadelphia, pp 657–666. http://dl.acm.org/citation.cfm?id=545381.545469

  27. Ohlebusch E, Gog S, Kügel A (2010) Computing matching statistics and maximal exact matches on compressed full-text indexes. In: XXth international symposium on string processing and information retrieval (SPIRE 2010), Los Cabos, pp 347–358

  28. Puglisi SJ, Smyth WF, Turpin AH (2007) A taxonomy of suffix array construction algorithms. ACM Comput Surv 39(2):4

  29. Rodeh M, Pratt VR, Even S (1981) Linear algorithm for data compression via string matching. J ACM 28(1):16–24

  30. Smola AJ, Vishwanathan S (2003) Fast kernels for string and tree matching. In: Becker S, Thrun S, Obermayer K (eds) Advances in neural information processing systems (NIPS ’03) 15, Vancouver. MIT, pp 585–592

  31. Stoye J, Gusfield D (2002) Simple and flexible detection of contiguous repeats using a suffix tree. Theor Comput Sci 270(1):843–856

  32. Ukkonen E (1995) On-line construction of suffix trees. Algorithmica 14(3):249–260

  33. Weiner P (1973) Linear pattern matching algorithms. In: IEEE conference record of 14th annual symposium on switching and automata theory (SWAT 1973), Iowa City, 1973. IEEE, pp 1–11

Author information

Correspondence to Alberto Apostolico.

Copyright information

© 2016 Springer Science+Business Media New York

About this entry

Cite this entry

Apostolico, A., Cunial, F. (2016). Suffix Trees and Arrays. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_627
