DOI: 10.1145/2063576.2064015
demonstration

Scalable similarity search of timeseries with variable dimensionality

Published: 24 October 2011

ABSTRACT

Timeseries can be similar in shape but differ in length. For example, the sound waves produced by the same word spoken twice have roughly the same shape, but one may be shorter in duration. Stream data mining, approximate querying of image and video databases, data compression, and near-duplicate detection are applications that need to classify or cluster such timeseries, and to search for and rank timeseries that are similar to a chosen timeseries. We demonstrate software for clustering and performing similarity search in databases of timeseries data, where the timeseries have high and variable dimensionality. Our demonstration uses Timeseries Sensitive Hashing (TSH) [3] to index the timeseries. TSH adapts Locality Sensitive Hashing (LSH), an approximate algorithm for indexing data points in a d-dimensional space under some (e.g., Euclidean) distance function. Unlike LSH, TSH can index points that do not have the same dimensionality. As examples of the potential of TSH, the demonstration will index and classify timeseries from an image database and timeseries describing human motion extracted from a video stream and a motion capture system.
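
To make the LSH idea mentioned in the abstract concrete, the following is a minimal sketch of classic Euclidean LSH for fixed-length vectors, using random Gaussian projections quantized into buckets. It is not TSH: how TSH handles variable dimensionality is described in [3] and is not reproduced here. The class name `EuclideanLSH` and the parameters `num_tables`, `hashes_per_table`, and `bucket_width` are illustrative assumptions, not names taken from the demonstrated software.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Toy LSH index for fixed-length vectors (illustrative only, not TSH)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4,
                 bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        self.points = {}
        # Each table concatenates several hashes h(v) = floor((a.v + b) / w),
        # with a drawn from a Gaussian and b drawn uniformly from [0, w).
        self.tables = []
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, bucket_width))
                for _ in range(hashes_per_table)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        # Bucket key for one table: a tuple of quantized projections.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b)
                       / self.bucket_width)
            for a, b in funcs
        )

    def add(self, label, v):
        self.points[label] = v
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(label)

    def query(self, v, k=1):
        # Gather candidates that collide with the query in any table, then
        # rank only those candidates by exact Euclidean distance.
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, v), []))
        ranked = sorted(candidates,
                        key=lambda lbl: math.dist(self.points[lbl], v))
        return ranked[:k]
```

A caller would build the index with `EuclideanLSH(dim=...)`, insert labeled vectors with `add`, and retrieve approximate nearest neighbors with `query`. Because only colliding candidates are ranked, a true neighbor can be missed; using more tables raises recall at the cost of memory. Every vector must have length `dim`, which is exactly the restriction that the abstract says TSH removes.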

References

  1. I. Assent, R. Krieger, F. Afschari, and T. Seidl. The TS-tree: efficient time series search and retrieval. In 11th Conference on Extending Database Technology (EDBT), pages 252--263, New York, NY, USA, 2008.
  2. Berkeley DB. http://www.oracle.com/database/berkeleydb/, 2011.
  3. O. U. Florez, A. Ocsa, and C. E. Dyreson. Sublinear querying of realistic timeseries and its application to human motion. In Multimedia Information Retrieval, pages 137--146, 2010.
  4. A. W.-c. Fu, E. Keogh, L. Y. H. Lau, and C. A. Ratanamahatana. Scaling and time warping in time series querying. In VLDB'05: Proceedings of the 31st international conference on Very Large Data Bases, pages 649--660. VLDB Endowment, 2005.
  5. A. Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD Conference, pages 47--57, 1984.
  6. E. Keogh and C. A. Ratanamahatana. Exact indexing of dynamic time warping. Knowledge and Information Systems, 7(3):358--386, 2005.
  7. H. Koga, T. Ishibashi, and T. Watanabe. Fast agglomerative hierarchical clustering algorithm using locality-sensitive hashing. Knowledge and Information Systems, 12:25--53, May 2007.
  8. J. Shieh and E. Keogh. iSAX: indexing and mining terabyte sized time series. In KDD'08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 623--631, New York, NY, USA, 2008. ACM.
  9. Y.-P. Wu, J.-J. Guo, and X.-J. Zhang. A linear DBScan algorithm based on LSH. In 6th Conference on Machine Learning and Cybernetics, August 2007.

    • Published in

      CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
      October 2011
      2712 pages
      ISBN:9781450307178
      DOI:10.1145/2063576

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall acceptance rate: 1,861 of 8,427 submissions, 22%
