Abstract
We present sublinear algorithms (algorithms that use significantly fewer resources than are needed to store or process the entire input stream) for discovering representative trends in data streams in the form of periodicities. Our algorithms sample \(\tilde{O}(\sqrt{n})\) positions, and thus scan not the entire data stream but merely a sublinear sample of it. Alternatively, our algorithms may be thought of as working on streaming inputs where each data item is seen once, but we store only a sublinear sample of size \(\tilde{O}(\sqrt{n})\), from which we can identify periodicities. In this work we present a variety of definitions of periodicity for a given stream, present sublinear sampling algorithms for discovering them, and prove that the algorithms meet our specifications and guarantees. No previously known results provide such guarantees for finding any such periodic trends. We also investigate the relationships between these different definitions of periodicity.
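To make the idea of inspecting only a sublinear sample concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes the stream is available as an indexable sequence and that a candidate period is already given. It then estimates how often position i disagrees with position i + period by checking roughly √n random positions. The function name `estimate_mismatch_rate`, its parameters, and the fixed seed are all hypothetical choices made for illustration.

```python
import math
import random


def estimate_mismatch_rate(stream, period, num_samples=None, seed=0):
    """Estimate the fraction of positions i with stream[i] != stream[i + period].

    Illustrative only: inspects about sqrt(n) random positions instead of
    scanning all n items. This is an assumed sketch, not the paper's method.
    """
    n = len(stream)
    if not 0 < period < n:
        raise ValueError("period must satisfy 0 < period < len(stream)")
    if num_samples is None:
        # Default sample size on the order of sqrt(n) (assumption for the sketch).
        num_samples = max(1, math.isqrt(n))
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(num_samples):
        # Pick a position whose partner i + period is still inside the stream.
        i = rng.randrange(n - period)
        if stream[i] != stream[i + period]:
            mismatches += 1
    return mismatches / num_samples


if __name__ == "__main__":
    data = [0, 1, 2, 3] * 2500  # exactly periodic with period 4, n = 10000
    print(estimate_mismatch_rate(data, 4))
    print(estimate_mismatch_rate(data, 3))
```

An estimate near zero suggests the candidate period fits most of the stream, while a large estimate suggests it does not. Discovering an unknown period, and doing so with provable guarantees, is what the paper's algorithms address and is beyond this sketch.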
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ergun, F., Muthukrishnan, S., Sahinalp, S.C. (2004). Sublinear Methods for Detecting Periodic Trends in Data Streams. In: Farach-Colton, M. (eds) LATIN 2004: Theoretical Informatics. LATIN 2004. Lecture Notes in Computer Science, vol 2976. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24698-5_6
DOI: https://doi.org/10.1007/978-3-540-24698-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21258-4
Online ISBN: 978-3-540-24698-5
eBook Packages: Springer Book Archive