Skip to main content

On the Structure of Consistent Partitions of Substring Set of a Word

  • Conference paper
Frontiers in Algorithmics (FAW 2009)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 5598))

Included in the following conference series:

  • 965 Accesses

Abstract

DAWG is a key data structure for string matching and it is widely used in bioinformatics and data compression. But DAWGs are memory greedy. Weighted directed word graph (WDWG) is a space-economical variation of DAWG which is as powerful as DAWG. The underlay concept of WDWGs is a new equivalent relation of the substrings of a word, namely the minimal consistent linear partition. However, the structure of the consistent linear partition is not extensively explored. In this paper, we present a theorem that gives insight into the structure of consistent partitions. Through this theorem, one can enumerate all the consistent linear partitions and verify whether a linear partition is consistent. It also demonstrates how to merge the DAWG into a consistent partition. In the end, we give a simple and easy-to-construct class of consistent partitions based on lexicographic order.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Allauzen, C., Crochemore, M., Raffinot, M.: Efficient Experimental String Matching by Weak Factor Recognition. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 51–72. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  2. Blumer, A., Blumer, J., Haussler, D., Ehrenfeucht, A., Chen, M.T., Seiferas, J.: The smallest automation recognizing the subwords of a text. Theoretical Computer Science 40, 31–55 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  3. Blumer, A., Blumer, J., Haussler, D., McConnell, R., Ehrenfeucht, A.: Complete inverted files for effcient text retrieval and analysis. Journal of the ACM 34(3), 578–595 (1987)

    Article  MATH  Google Scholar 

  4. Crochemore, M.: Transducers and repetitions. Theoretical Computer Science 45, 63–86 (1986)

    Article  MathSciNet  MATH  Google Scholar 

  5. Crochemore, M., Czumaj, A., Gasieniec, L., Lecroq, T., Plandowski, W., Rytter, W.: Fast Practical Multi-Pattern Matching. Inf. Process. Lett. 71(3-4), 107–113 (1999)

    Article  MathSciNet  MATH  Google Scholar 

  6. Crochemore, M., Ilie, L., Seid-Hilmi, E.: The Structure of Factor Oracles. Int. J. Found. Comput. Sci 18(4), 781–797 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  7. Charras, C., Lecroq, T.: Exact string matching algorithms (1997), http://www-igm.univ-mlv.fr/~lecroq/string/

  8. Gusfield, D.: Algorithms on Strings Trees and Sequences. Cambridge UniversityPress, New York (1997)

    Book  MATH  Google Scholar 

  9. Grossi, R., Vitter, J.: Compressed suffix arrays and suffix trees with applications to text indexing and string matching. In: Proceedings of the 32nd ACM Symposium on Theory of Computing (2000)

    Google Scholar 

  10. Inenaga, S., Shinohara, A., Takeda, M., Arikawa, S.: Compact Directed Acyclic Word Graphs for a Sliding Window. Journal of Discrete Algorithms 2(1), 33–51 (2004); Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476. Springer, Heidelberg (2002)

    Article  MathSciNet  MATH  Google Scholar 

  11. Miyamoto, S., Inenaga, S., Takeda, M., Shinohara, A.: Ternary Directed Acyclic Word Graphs. Theoretical Compututer Science 328(1-2), 97–111 (2004); H. Ibarra, O., Dang, Z. (eds.) CIAA 2003. LNCS, vol. 2759, pp. 120–130. Springer, Heidelberg (2003)

    Article  MathSciNet  MATH  Google Scholar 

  12. Manber, U., Myers, G.: Suffix arrays: A new method for on-line string searches. SIAM Journal on Computing 22, 935–948 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  13. Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14, 249–260 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  14. Weiner, P.: Linear pattern matching algorithm. In: Proceedings of the 14th Annual IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)

    Google Scholar 

  15. Zhang, M., Ju, J.: Space-economical reassembly for intrusion detection system. In: Qing, S., Gollmann, D., Zhou, J. (eds.) ICICS 2003. LNCS, vol. 2836, pp. 393–404. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  16. Zhang, M., Tang, J., Guo, D., Hu, L., Li, Q.: Succinct Text Indexes on Large Alphabet. In: Cai, J.-Y., Cooper, S.B., Li, A. (eds.) TAMC 2006. LNCS, vol. 3959, pp. 528–537. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  17. Zhang, M., Hu, L., Li, Q., Ju, J.: Weighted Directed Word Graph. In: Apostolico, A., Crochemore, M., Park, K. (eds.) CPM 2005. LNCS, vol. 3537, pp. 156–167. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, M., Zhang, Y., Hu, L., Xin, P. (2009). On the Structure of Consistent Partitions of Substring Set of a Word. In: Deng, X., Hopcroft, J.E., Xue, J. (eds) Frontiers in Algorithmics. FAW 2009. Lecture Notes in Computer Science, vol 5598. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02270-8_33

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-02270-8_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02269-2

  • Online ISBN: 978-3-642-02270-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics