Abstract
This paper compares two data structures used for indexing input texts: the Suffix Array and the Directed Acyclic Word Graph (DAWG). We present an efficient DAWG implementation and compare it with other structures used for text indexing. Construction time and the speed of searching for a set of substrings are reported for both the DAWG and the Suffix Array.
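As an illustration of the two index structures named above, the following is a minimal Python sketch and not the implementation evaluated in the paper. All function names are hypothetical. The suffix array is built by naive sorting of suffixes, and the DAWG is built with the standard online suffix-automaton construction. Neither uses the space-saving encodings an efficient implementation would need.

```python
def build_suffix_array(text):
    # Naive construction (assumption: clarity over speed):
    # sort suffix start positions by the suffix beginning there.
    return sorted(range(len(text)), key=lambda i: text[i:])


def suffix_array_contains(text, sa, pattern):
    # Binary search for the first suffix whose length-m prefix is >= pattern.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and text[sa[lo]:sa[lo] + m] == pattern


def build_dawg(text):
    # Online suffix-automaton (DAWG) construction, one character at a time.
    # trans[s] maps a character to the target state; link[s] is the suffix
    # link; length[s] is the length of the longest string reaching state s.
    trans, link, length = [{}], [-1], [0]
    last = 0
    for ch in text:
        cur = len(trans)
        trans.append({})
        link.append(0)
        length.append(length[last] + 1)
        p = last
        # Add the new transition along the suffix-link chain.
        while p != -1 and ch not in trans[p]:
            trans[p][ch] = cur
            p = link[p]
        if p != -1:
            q = trans[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split state q: the clone takes over the shorter strings.
                clone = len(trans)
                trans.append(dict(trans[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and trans[p].get(ch) == q:
                    trans[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    return trans


def dawg_contains(trans, pattern):
    # A pattern is a substring iff it can be read from the initial state.
    state = 0
    for ch in pattern:
        state = trans[state].get(ch)
        if state is None:
            return False
    return True
```

In this sketch a DAWG query follows one transition per pattern character, whereas the suffix array query performs a binary search with a string comparison at each step. This is the kind of trade-off between search speed and index size that the comparison addresses.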
This research has been partially supported by the Ministry of Education, Youth, and Sports of the Czech Republic under research program No. J04/98:212300014 and by the Grant Agency of the Czech Republic under research program No. 102/01/1433.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Balík, M. (2003). DAWG versus Suffix Array. In: Champarnaud, JM., Maurel, D. (eds) Implementation and Application of Automata. CIAA 2002. Lecture Notes in Computer Science, vol 2608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44977-9_23
DOI: https://doi.org/10.1007/3-540-44977-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40391-3
Online ISBN: 978-3-540-44977-5
eBook Packages: Springer Book Archive