Years and Authors of Summarized Original Work
1982; Misra, Gries
Problem Definition
The frequent items problem is to process a stream of items and find all items that occur more than a given fraction of the time. It is one of the most heavily studied problems in data stream algorithms, dating back to the 1980s. Many applications rely directly or indirectly on finding the frequent items, and implementations are in use in large-scale industrial systems. Informally, given a sequence of items, the problem is to find those items that occur most frequently. This is typically formalized as finding all items whose frequency exceeds a specified fraction of the total number of items. Variations arise when the items have weights, and further when these weights can be negative.
Definition 1
Given a stream \(\mathcal{S}\) of \(n\) items \(t_{1}, \ldots, t_{n}\), the frequency of an item \(i\) is \(f_{i} = \vert \{j \mid t_{j} = i\}\vert\). The exact \(\phi\)-frequent items comprise the set \(\{i \mid f_{i} > \phi n\}\).
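To make Definition 1 concrete, the following is a minimal sketch that computes the exact ϕ-frequent items by counting every distinct item. It is only a baseline for illustration: it keeps one counter per distinct item, so it is not a small-space streaming summary. The function name `exact_frequent_items` and the sample stream are assumptions made for this example and do not come from the original work.

```python
from collections import Counter
from typing import Hashable, Iterable, Set


def exact_frequent_items(stream: Iterable[Hashable], phi: float) -> Set[Hashable]:
    """Return the set {i | f_i > phi * n} by exact counting (hypothetical helper)."""
    counts = Counter(stream)      # f_i for every distinct item i
    n = sum(counts.values())      # total number of items in the stream
    # Strict inequality, as in Definition 1: f_i must exceed phi * n.
    return {item for item, f in counts.items() if f > phi * n}


# Illustrative stream of n = 8 items: a occurs 4 times, b twice, c and d once.
stream = ["a", "b", "a", "c", "a", "b", "a", "d"]
frequent_quarter = exact_frequent_items(stream, 0.25)  # threshold phi * n = 2
frequent_fifth = exact_frequent_items(stream, 0.2)     # threshold phi * n = 1.6
```

With ϕ = 0.25 only `a` qualifies, because `b` occurs exactly ϕn = 2 times and the inequality is strict. Lowering ϕ to 0.2 admits `b` as well.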
Recommended Reading
Agarwal P, Cormode G, Huang Z, Phillips J, Wei Z, Yi K (2012) Mergeable summaries. In: ACM principles of database systems, Scottsdale
Berinde R, Cormode G, Indyk P, Strauss M (2009) Space-optimal heavy hitters with strong error bounds. In: ACM principles of database systems, Providence
Bose P, Kranakis E, Morin P, Tang Y (2003) Bounds for frequency estimation of packet streams. In: SIROCCO, Umeå
Boyer B, Moore J (1981) A fast majority vote algorithm. Technical report ICSCA-CMP-32, Institute for Computer Science, University of Texas
Boyer RS, Moore JS (1991) MJRTY – a fast majority vote algorithm. In: Bledsoe WW, Boyer RS (eds) Automated reasoning: essays in honor of Woody Bledsoe. Automated reasoning series. Kluwer Academic, Dordrecht/Boston, pp 105–117
Chakrabarti A, Cormode G, McGregor A (2007) A near-optimal algorithm for computing the entropy of a stream. In: ACM-SIAM symposium on discrete algorithms, New Orleans
Cormode G, Hadjieleftheriou M (2009) Finding the frequent items in streams of data. Commun ACM 52(10):97–105
Demaine E, López-Ortiz A, Munro JI (2002) Frequency estimation of internet packet streams with limited space. In: European symposium on algorithms (ESA), Rome
Fischer M, Salzberg S (1982) Finding a majority among n votes: solution to problem 81-5. J Algorithms 3(4):376–379
Ghashami M, Phillips JM (2014) Relative errors for deterministic low-rank matrix approximations. In: ACM-SIAM symposium on discrete algorithms, Portland, pp 707–717
Karp R, Papadimitriou C, Shenker S (2003) A simple algorithm for finding frequent elements in sets and bags. ACM Trans Database Syst 28:51–55
Liberty E (2013) Simple and deterministic matrix sketching. In: ACM SIGKDD, Chicago, pp 581–588
Manerikar N, Palpanas T (2009) Frequent items in streaming data: an experimental evaluation of the state-of-the-art. Data Knowl Eng 68(4):415–430
Manku G, Motwani R (2002) Approximate frequency counts over data streams. In: International conference on very large data bases, Hong Kong, pp 346–357
Metwally A, Agrawal D, Abbadi AE (2005) Efficient computation of frequent and top-k elements in data streams. In: International conference on database theory, Edinburgh
Misra J, Gries D (1982) Finding repeated elements. Sci Comput Program 2:143–152
Moore J (1981) Problem 81-5. J Algorithms 2:208–209
Copyright information
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Cormode, G. (2016). Misra-Gries Summaries. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_572
DOI: https://doi.org/10.1007/978-1-4939-2864-4_572
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering