Abstract
In this paper we consider problems related to the sortedness of a data stream. In the first part of this work, we investigate the problem of estimating the distance to monotonicity: given a finite stream of length \(n\) over the alphabet \(\{1,\dots,m\}\), we give a deterministic \((2+\varepsilon)\)-approximation algorithm for estimating its distance to monotonicity in space \(O\left(\tfrac{1}{\varepsilon^2}\log^2(\varepsilon n)\right)\). This improves over the previous randomized \((4+\varepsilon)\)-approximation algorithm due to Gopalan, Jayram, Krauthgamer and Kumar in SODA 2007.
We then consider the problem of approximating the length of the longest increasing subsequence of the input stream. Through the analysis of a multi-party communication game, we prove that deterministic streaming algorithms that approximate the length of the longest increasing subsequence to within a \((1+\varepsilon)\) factor require \(\Omega\left(\sqrt{\tfrac{n}{\varepsilon}}\right)\) bits of space for any \(\varepsilon\) in the range \((1/n, 1/6]\). This bound matches the upper bound given in the work of Gopalan et al. to within a logarithmic factor. We note that, independently of our work and via a different proof strategy, Gál and Gopalan have shown an \(\Omega\left(\sqrt{\tfrac{n}{\varepsilon}}\log\left(\tfrac{m}{\varepsilon n}\right)\right)\) lower bound for all \(\varepsilon \ge 1/n\), including the range \(\varepsilon = \Omega(1)\).
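To make the two quantities concrete, the following is a minimal sketch of an exact, non-streaming baseline. It is not the algorithm of this paper. It assumes that "monotone" means non-decreasing and that the distance to monotonicity is the minimum number of elements that must be deleted to leave a non-decreasing sequence, so that it equals the stream length minus the length of the longest non-decreasing subsequence. The function names are illustrative only.

```python
from bisect import bisect_right


def longest_nondecreasing_subsequence_length(stream):
    """Exact length of the longest non-decreasing subsequence (patience sorting).

    tails[i] holds the smallest possible last element of a non-decreasing
    subsequence of length i + 1 seen so far.
    """
    tails = []
    for x in stream:
        # First position whose tail is strictly greater than x.
        pos = bisect_right(tails, x)
        if pos == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[pos] = x    # x gives a smaller tail for length pos + 1
    return len(tails)


def distance_to_monotonicity(stream):
    """Assumed definition: minimum number of deletions that leave a
    non-decreasing sequence, i.e. n minus the subsequence length above."""
    items = list(stream)
    return len(items) - longest_nondecreasing_subsequence_length(items)
```

This baseline stores one entry per unit of subsequence length, so its space can grow linearly in \(n\); the results stated in the abstract concern how much of that space can be saved when only an approximation is required.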
References
N. Ailon, B. Chazelle, S. Comandur and D. Liu: Estimating the distance to a monotone function, Random Struct. Algorithms 31 (2007), 371–383.
D. Aldous and P. Diaconis: Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem, Bull. Amer. Math. Soc. 36 (1999), 413–432.
A. Chakrabarti: A note on randomized streaming space bounds for the longest increasing subsequence problem, Inf. Process. Lett. 112 (2012), 261–263.
A. Gál and P. Gopalan: Lower bounds on streaming algorithms for approximating the length of the longest increasing subsequence, SIAM J. Comput. 39 (2010), 3463–3479.
P. Gopalan, T. S. Jayram, R. Krauthgamer and R. Kumar: Estimating the sortedness of a data stream, SODA (2007), 318–327.
S. Guha and A. McGregor: Tight lower bounds for multi-pass stream computation via pass elimination, ICALP 1 (2008), 760–772.
E. Kushilevitz and N. Nisan: Communication complexity, Cambridge University Press, 1997.
X. Lin, H. Lu, J. Xu and J. X. Yu: Continuously maintaining quantile summaries of the most recent n elements over a data stream, ICDE (2004), 362–373.
D. Liben-Nowell, E. Vee and A. Zhu: Finding longest increasing and common subsequences in streaming data, COCOON (2005), 263–272.
M. Saks and C. Seshadhri: Estimating the longest increasing sequence in polylogarithmic time, FOCS (2010), 458–467.
M. Saks and C. Seshadhri: Space efficient streaming algorithms for the distance to monotonicity and asymmetric edit distance, SODA (2013).
X. Sun and D. P. Woodruff: The communication and streaming complexity of computing the longest common and increasing subsequences, SODA (2007), 336–345.
Additional information
A previous version of this paper by the same authors appeared in SODA 2008 under the title “On distance to monotonicity and longest increasing subsequence of a data stream”.
Research supported by NSERC Discovery Grant and PIMS Collaborative Research Grant.
Research was done while the author was a PhD student at School of Computing Science, SFU. Research supported by NSERC Discovery Grant and PIMS Collaborative Research Grant.