
An Efficient Similarity Searching Algorithm Based on Clustering for Time Series

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5077)

Abstract

Indexing large time series databases is crucial for answering time series queries efficiently. In this paper, we propose a novel indexing scheme, RQI (Range Query based on Index), which combines three filtering methods: first-k filtering, indexed lower and upper bounding, and triangle inequality pruning. The basic idea is to compute the Haar wavelet transform of each time series, use its first k coefficients to form an MBR (minimal bounding rectangle), and then apply a point filtering method. At the same time, lower and upper bounding features of each time series are computed in advance and stored in the index structure. Finally, triangle inequality pruning is applied using distances between time series that are computed beforehand. We then introduce a novel lower bounding distance function, SLBS (Symmetrical Lower Bounding based on Segment), and a novel clustering algorithm, CSA (Clustering based on Segment Approximation), to further improve the search efficiency of the point filtering method by preserving good clustering in the index structure. Extensive experiments on both synthetic and real datasets show that our techniques provide excellent pruning power and can achieve an order-of-magnitude performance improvement for time series queries over naive evaluation techniques.
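
The filtering ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's RQI index, and it does not implement the MBR, SLBS, or CSA. It shows only two generic ingredients under stated assumptions: an orthonormal Haar transform whose first k coefficients give a lower bound on the Euclidean distance, and triangle inequality pruning against a single precomputed pivot series. All function names (`haar_transform`, `build_index`, `range_query`) and the single-pivot choice are hypothetical, introduced here for illustration.

```python
import math


def haar_transform(series):
    """Orthonormal Haar transform; the length must be a power of two.

    Coefficients are ordered from coarse to fine: overall average first,
    then detail coefficients of increasing resolution.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(x) for x in series]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Coarser details are placed in front of the finer ones found earlier.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(database, pivot, k):
    """Precompute, per series, its first-k Haar features and pivot distance."""
    return [(haar_transform(s)[:k], euclidean(pivot, s)) for s in database]


def range_query(query, database, index, pivot, radius, k):
    """Return indices of all series within `radius` of `query`."""
    q_feat = haar_transform(query)[:k]
    d_qp = euclidean(query, pivot)
    hits = []
    for i, (feat, d_sp) in enumerate(index):
        # Triangle inequality: |d(q,p) - d(s,p)| <= d(q,s).
        if abs(d_qp - d_sp) > radius:
            continue
        # First-k filter: the transform is orthonormal, so the distance on
        # a coefficient subset never exceeds the true distance.
        if euclidean(q_feat, feat) > radius:
            continue
        # Survivors are verified with the exact distance.
        if euclidean(query, database[i]) <= radius:
            hits.append(i)
    return hits


database = [[1, 2, 3, 4], [1, 2, 3, 5], [10, 10, 10, 10], [0, 0, 0, 0]]
pivot = database[3]  # assumption: any stored series may serve as the pivot
index = build_index(database, pivot, k=2)
result = range_query([1, 2, 3, 4], database, index, pivot, radius=1.5, k=2)
```

Both filters only discard series whose true distance provably exceeds the radius, so the result matches a brute-force scan. A real index would additionally group the first-k features into bounding rectangles so that whole groups can be skipped at once.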




Editor information

Petra Perner


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Feng, Y., Jiang, T., Zhou, Y., Li, J. (2008). An Efficient Similarity Searching Algorithm Based on Clustering for Time Series. In: Perner, P. (eds) Advances in Data Mining. Medical Applications, E-Commerce, Marketing, and Theoretical Aspects. ICDM 2008. Lecture Notes in Computer Science, vol 5077. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70720-2_28


  • DOI: https://doi.org/10.1007/978-3-540-70720-2_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70717-2

  • Online ISBN: 978-3-540-70720-2

  • eBook Packages: Computer Science, Computer Science (R0)
