Periodicity in Streams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6302)

Abstract

In this work we study sublinear-space algorithms for detecting periodicity over data streams. A sequence of length n is said to be periodic if it consists of repetitions of a block of length p for some \(p \leq \frac{n}{2}\). In the first part of this paper, we give a 1-pass randomized streaming algorithm that uses \(O(\log^2 n)\) space and reports the shortest period if the given stream is periodic. At the heart of this result is a 1-pass \(O(\log n \log m)\) space streaming pattern matching algorithm. This algorithm uses ideas similar to those of Porat and Porat’s algorithm from FOCS 2009, but it is simpler and does not need an offline pre-processing stage.
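To make the definition of periodicity concrete, here is a minimal offline sketch in Python. It is not the streaming algorithm from the paper: it stores the whole sequence and runs in quadratic time, whereas the paper's algorithm works in one pass and polylogarithmic space. The function name `shortest_period` is hypothetical, and the sketch assumes that the final repetition of the block may be truncated, i.e. that period p means seq[i] == seq[i + p] for every valid i.

```python
def shortest_period(seq):
    """Return the smallest p <= len(seq) // 2 such that seq[i] == seq[i + p]
    for every valid index i, or None if no such p exists.

    Assumption (not stated in the abstract): the last repetition of the
    block may be cut short, so "abcabcab" counts as having period 3.
    """
    n = len(seq)
    for p in range(1, n // 2 + 1):
        # p is a period exactly when the sequence agrees with itself shifted by p.
        if all(seq[i] == seq[i + p] for i in range(n - p)):
            return p
    return None


if __name__ == "__main__":
    for s in ["abcabcabc", "abcabcab", "aaaa", "abcd"]:
        print(s, shortest_period(s))
```

The shift-and-compare test is the part a streaming algorithm has to replace, since comparing the sequence against a shifted copy of itself directly requires storing it.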

In the second part, we study distance to p-periodicity under the Hamming metric, where we estimate the minimum number of character substitutions needed to make a given sequence p-periodic. In streaming terminology, this problem can be described as computing the cascaded aggregate \(L_1\circ F_1^{res(1)}\) over a matrix \(A_{p \times \lfloor\frac{n}{p}\rfloor}\) given in column ordering. For this problem, we present a randomized streaming algorithm with approximation factor 2 + ε that takes \(\tilde{O}(\frac{1}{\epsilon^2})\) space. We also show a 1 + ε randomized streaming algorithm which uses \(\tilde{O}(\frac{1}{\epsilon^{5.5}}p^{1/2})\) space.
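The quantity being approximated can be computed exactly offline, which may help in reading the cascaded-aggregate notation. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it uses linear space, and the name `distance_to_p_periodicity` is hypothetical. Row r of the matrix holds the characters at positions r, r + p, r + 2p, and so on; the cheapest way to make a row constant is to keep its most frequent character and substitute the rest, which is \(F_1\) minus the largest frequency (the residual \(F_1^{res(1)}\)). Summing over the p rows gives the outer \(L_1\).

```python
from collections import Counter


def distance_to_p_periodicity(seq, p):
    """Exact Hamming distance from seq to the nearest p-periodic sequence.

    Offline illustration only; assumes p >= 1. Unlike the matrix in the
    abstract, which has floor(n / p) columns, this sketch also counts a
    trailing partial block.
    """
    total = 0
    for r in range(p):
        row = seq[r::p]  # characters at positions r, r + p, r + 2p, ...
        if row:
            # Keep the most frequent character, substitute all others.
            total += len(row) - max(Counter(row).values())
    return total


if __name__ == "__main__":
    print(distance_to_p_periodicity("abcabdabc", 3))
```

In the example call, only the third row ("c", "d", "c") is not constant, so a single substitution makes the sequence 3-periodic.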

References

  1. Alon, N., Matias, Y., Szegedy, M.: Space complexity of approximating the frequency moments. In: STOC 1996 (1996)

  2. Amir, A., Lewenstein, M., Porat, E.: Faster algorithms for string matching with k mismatches. In: SODA 2000 (2000)

  3. Bar-Yossef, Z., Kumar, R., Sivakumar, D.: Sampling algorithms: lower bounds and applications. In: CCC 2002 (2002)

  4. Berinde, R., Cormode, G., Indyk, P., Strauss, M.: Space-optimal heavy hitters with strong error bounds. In: PODS 2009 (2009)

  5. Bhuvanagiri, L., Ganguly, S., Kesh, D., Saha, C.: Simpler algorithm for estimating frequency moments of data streams. In: SODA 2006 (2006)

  6. Bose, P., Kranakis, E., Morin, P., Tang, Y.: Bounds for frequency estimation of packet streams. In: Proceedings of the 10th International Colloquium on Structural Information and Communication Complexity (2003)

  7. Charikar, M., Chen, K., Farach-Colton, M.: Finding frequent items in data streams. Theor. Comput. Sci. 312(1), 3–15 (2004)

  8. Cole, R., Hariharan, R.: Approximate String Matching: A Simpler Faster Algorithm. In: SODA 1998 (1998)

  9. Cormode, G., Muthukrishnan, S.: Space efficient mining of multigraph streams. In: PODS 2005, pp. 271–282 (2005)

  10. Czumaj, A., Gąsieniec, L.: On the complexity of determining the period of a string. In: Giancarlo, R., Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 412–422. Springer, Heidelberg (2000)

  11. Elfeky, M.G., Aref, W.G., Elmagarmid, A.K.: STAGGER: periodicity mining of data streams using expanding sliding windows. In: ICDM 2006 (2006)

  12. Ergun, F., Muthukrishnan, S., Sahinalp, C.: Sublinear methods for detecting periodic trends in data streams. In: Farach-Colton, M. (ed.) LATIN 2004. LNCS, vol. 2976, pp. 16–28. Springer, Heidelberg (2004)

  13. Ganguly, S., Kesh, D., Saha, C.: Practical algorithms for tracking database join sizes. In: Sarukkai, S., Sen, S. (eds.) FSTTCS 2005. LNCS, vol. 3821, pp. 297–309. Springer, Heidelberg (2005)

  14. Indyk, P.: Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. ACM 53(3), 307–323 (2006)

  15. Indyk, P., Woodruff, D.: Optimal approximations of the frequency moments of data streams. In: STOC 2005 (2005)

  16. Jayram, T.S., Woodruff, D.: The data stream space complexity of cascaded norms. In: FOCS 2009 (2009)

  17. Kane, D.M., Nelson, J., Woodruff, D.: An optimal algorithm for the distinct elements problem. In: PODS 2010 (2010)

  18. Karp, R.M., Rabin, M.O.: Efficient randomized pattern matching algorithms. IBM J. Res. Dev. 31(2), 249–260 (1987)

  19. Knuth, D.E., Morris, J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comp. 6, 323–350 (1977)

  20. Lachish, O., Newman, I.: Testing periodicity. In: Chekuri, C., Jansen, K., Rolim, J.D.P., Trevisan, L. (eds.) APPROX 2005 and RANDOM 2005. LNCS, vol. 3624, pp. 366–377. Springer, Heidelberg (2005)

  21. Lipsky, O., Porat, E.: Improved sketching of hamming distance with error correcting. In: Ma, B., Zhang, K. (eds.) CPM 2007. LNCS, vol. 4580, pp. 173–182. Springer, Heidelberg (2007)

  22. Misra, J., Gries, D.: Finding repeated elements. Technical Report, Cornell University (1982)

  23. Monemizadeh, M., Woodruff, D.: 1-Pass relative-error Lp-sampling with applications. In: SODA 2010 (2010)

  24. Muthukrishnan, S.: Data stream algorithms. In: The Barbados Workshop on Computational Complexity (2009)

  25. Porat, B., Porat, E.: Exact and approximate pattern matching in the streaming model. In: FOCS 2009 (2009)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ergun, F., Jowhari, H., Sağlam, M. (2010). Periodicity in Streams. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX 2010, RANDOM 2010. Lecture Notes in Computer Science, vol 6302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15369-3_41

  • DOI: https://doi.org/10.1007/978-3-642-15369-3_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15368-6

  • Online ISBN: 978-3-642-15369-3

  • eBook Packages: Computer Science, Computer Science (R0)
