Indexing of Sequences of Sets for Efficient Exact and Similar Subsequence Matching

  • Conference paper
Computer and Information Sciences - ISCIS 2005 (ISCIS 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3733)

Abstract

Object-relational database management systems allow users to define complex data types, such as objects, collections, and nested tables. Unfortunately, most commercially available database systems support neither efficient querying nor indexing of complex attributes. Various indexing schemes for complex data types have been proposed in the literature, but most of them are application-oriented. The lack of a single universal indexing technique for attributes containing sets and sequences of values significantly hinders the practical usability of these data types in user applications. In this paper we present a novel indexing technique for sequence-valued attributes. Our index supports not only sequences of values but also sequences of sets of values. Experimental evaluation demonstrates the feasibility and benefit of the index for exact and similar subsequence matching.

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Andrzejewski, W., Morzy, T., Morzy, M. (2005). Indexing of Sequences of Sets for Efficient Exact and Similar Subsequence Matching. In: Yolum, P., Güngör, T., Gürgen, F., Özturan, C. (eds) Computer and Information Sciences - ISCIS 2005. ISCIS 2005. Lecture Notes in Computer Science, vol 3733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11569596_88

  • DOI: https://doi.org/10.1007/11569596_88

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29414-6

  • Online ISBN: 978-3-540-32085-2

  • eBook Packages: Computer Science, Computer Science (R0)
