Suffix Tree Construction in Hierarchical Memory

Ferragina, Paolo

doi:10.1007/978-1-4939-2864-4_413

Years and Authors of Summarized Original Work

2000; Farach-Colton, Ferragina, Muthukrishnan

Problem Definition

The suffix tree is the ubiquitous data structure of combinatorial pattern matching, used in a myriad of situations: to cite just a few, searching, data compression and mining, and bioinformatics [7]. In these applications, the large data sets now available involve the use of numerous memory levels, which constitute the storage medium of modern PCs: L1 and L2 caches, internal memory, multiple disks, and remote hosts over a network. The power of this memory organization is that it may be able to offer the expected access time of the fastest level (i.e., cache) while keeping the average cost per memory cell near that of the cheapest level (i.e., disk), provided that data are properly cached and delivered to the requiring algorithms. Neglecting questions pertaining to the cost of memory references may even prevent the use of algorithms on large sets of input data. Engineering research is...

Recommended Reading

1. Bedathur SJ, Haritsa JR (2004) Engineering a fast online persistent suffix tree construction. In: Proceedings of the 20th international conference on data engineering, Boston, pp 720–731
2. Cheung C, Yu J, Lu H (2005) Constructing suffix tree for gigabyte sequences with megabyte memory. IEEE Trans Knowl Data Eng 17:90–105
3. Farach-Colton M, Ferragina P, Muthukrishnan S (2000) On the sorting-complexity of suffix tree construction. J ACM 47:987–1011
4. Ferragina P (2005) Handbook of computational molecular biology. In: Computer and information science series, ch. 35 on “String search in external memory: algorithms and data structures”. Chapman & Hall/CRC, Florida
5. Ferragina P, Grossi R (1999) The string B-tree: a new data structure for string search in external memory and its applications. J ACM 46:236–280
6. Ferragina P, Gagie T, Manzini G (2012) Lightweight data indexing and compression in external memory. Algorithmica 63(3):707–730
7. Gusfield D (1997) Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge University Press, Cambridge
8. Hon W, Sadakane K, Sung W (2009) Breaking a time-and-space barrier in constructing full-text indices. SIAM J Comput 38(6):2162–2178
9. Hunt E, Atkinson M, Irving R (2002) Database indexing for large DNA and protein sequence collections. Int J Very Large Data Bases 11:256–271
10. Kärkkäinen J, Sanders P, Burkhardt S (2006) Linear work suffix array construction. J ACM 53:918–936
11. Ko P, Aluru S (2007) Optimal self-adjusting trees for dynamic string data in secondary storage. In: Symposium on string processing and information retrieval (SPIRE), Santiago. LNCS, vol 4726, pp 184–194. Springer, Berlin
12. Mäkinen V, Navarro G (2008) Dynamic entropy-compressed sequences and full-text indexes. ACM Trans Algorithm 4(3)
13. Manber U, Myers G (1993) Suffix arrays: a new method for on-line string searches. SIAM J Comput 22:935–948
14. Mansour E, Allam A, Skiadopoulos S, Kalnis P (2011) ERA: efficient serial and parallel suffix tree construction for very long strings. PVLDB 5(1):49–60
15. Navarro G, Baeza-Yates R (2000) A hybrid indexing method for approximate string matching. J Discr Algorithms 1:21–49
16. Navarro G, Mäkinen V (2007) Compressed full text indexes. ACM Comput Surv 39(1): Article no 2
17. Tian Y, Tata S, Hankins RA, Patel JM (2005) Practical methods for constructing suffix trees. VLDB J 14(3):281–299
18. Thomo A, Barsky M, Stege U (2010) A survey of practical algorithms for suffix tree construction in external memory. Softw Pract Experience 40(11):965–988
19. Tsirogiannis D, Koudas N (2010) Suffix tree construction on modern hardware. In: Proceedings of the 13th international conference on extending database technology (EDBT), Lausanne, pp 263–274
20. Vitter J (2002) External memory algorithms and data structures: dealing with MASSIVE DATA. ACM Comput Surv 33:209–271

Author information

Authors and Affiliations

Department of Computer Science, University of Pisa, Largo Bruno Pontecorvo 3, I-56127, Pisa, Italy
Paolo Ferragina

Corresponding author

Correspondence to Paolo Ferragina.

Editor information

Editors and Affiliations

Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, USA
Ming-Yang Kao

Cite this entry

Ferragina, P. (2016). Suffix Tree Construction in Hierarchical Memory. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_413

DOI: https://doi.org/10.1007/978-1-4939-2864-4_413
Published: 22 April 2016
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering