Years and Authors of Summarized Original Work
2009; Hon, Shah, Vitter
2013; Belazzougui, Navarro, Valenzuela
2013; Tsur
2014; Hon, Shah, Thankachan, Vitter
2014; Navarro, Thankachan
Problem Definition
We face the following problem.
Problem 1 (Top-k document retrieval)
Let \(\mathcal{D} =\{ \mathsf{T}_{1},\mathsf{T}_{2},\ldots ,\mathsf{T}_{D}\}\) be a collection of D documents of n characters in total, drawn from an alphabet Σ = [σ]. The relevance of a document \(\mathsf{T}_{d}\) with respect to a pattern P, denoted by w(P,d), is a function of the set of occurrences of P in \(\mathsf{T}_{d}\). Our task is to index \(\mathcal{D}\) such that, whenever a pattern P[1,p] and a parameter k come as a query, the k documents with the highest w(P,⋅) values can be reported efficiently.
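To make the query semantics concrete, the following is a minimal brute-force sketch of Problem 1. It is not one of the compressed indexes this entry summarizes: it scans every document at query time and only illustrates what a correct answer looks like. Several choices here are assumptions made for illustration: the relevance function defaults to term frequency (the number of occurrences of P in a document), documents that do not contain P are never reported, ties are broken by smaller document identifier, and the names `occurrences` and `top_k` are hypothetical.

```python
import heapq
from typing import Callable, List


def occurrences(text: str, pattern: str) -> List[int]:
    """Return the 0-based start positions of all (possibly overlapping)
    occurrences of pattern in text."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def top_k(docs: List[str], pattern: str, k: int,
          relevance: Callable[[List[int]], float] = len) -> List[int]:
    """Return the identifiers (1-based) of the k documents with the highest
    relevance w(P, d), where w is computed from the occurrence set of P.

    Assumptions of this sketch: relevance defaults to term frequency,
    documents without any occurrence are skipped, and ties are broken
    by smaller document identifier.
    """
    scored = []
    for d, text in enumerate(docs, start=1):
        occ = occurrences(text, pattern)
        if occ:
            scored.append((relevance(occ), d))
    # Sort key: higher score first, then smaller document identifier.
    best = heapq.nsmallest(k, scored, key=lambda item: (-item[0], item[1]))
    return [d for _, d in best]


if __name__ == "__main__":
    collection = ["banana", "ananas", "bandana", "cherry"]
    print(top_k(collection, "ana", 2))
```

This baseline takes time proportional to n per query, whereas the goal stated above is to preprocess \(\mathcal{D}\) so that queries are answered efficiently from an index. Passing a different `relevance` callable (for example, one based on the positions in the occurrence set) models other choices of w(P,d).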
Recommended Reading
Belazzougui D, Navarro G, Valenzuela D (2013) Improved compressed indexes for full-text document retrieval. J Discret Algorithms 18:3–13
Gagie T, Kärkkäinen J, Navarro G, Puglisi SJ (2013) Colored range queries and document retrieval. Theor Comput Sci 483:36–50
Hon WK, Shah R, Vitter JS (2009) Space-efficient framework for top-k string retrieval problems. In: FOCS, Atlanta, pp 713–722
Hon WK, Shah R, Thankachan SV, Vitter JS (2014) Space-efficient frameworks for top-k string retrieval. J ACM 61(2):9
Navarro G (2014) Spaces, trees, and colors: the algorithmic landscape of document retrieval on sequences. ACM Comput Surv 46(4):52
Navarro G, Mäkinen V (2007) Compressed full-text indexes. ACM Comput Surv 39(1):2
Navarro G, Nekrich Y (2012) Top-k document retrieval in optimal time and linear space. In: SODA, Kyoto, pp 1066–1077
Navarro G, Thankachan SV (2014) New space/time tradeoffs for top-k document retrieval on sequences. Theor Comput Sci 542:83–97
Raman R, Raman V, Satti SR (2007) Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans Algorithms 3(4):43
Russo L, Navarro G, Oliveira AL (2011) Fully compressed suffix trees. ACM Trans Algorithms 7(4):53
Shah R, Sheng C, Thankachan SV, Vitter JS (2013) Top-k document retrieval in external memory. In: ESA, Sophia Antipolis, pp 803–814
Tsur D (2013) Top-k document retrieval in optimal space. Inf Process Lett 113(12):440–443
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Thankachan, S.V. (2016). Compressed Document Retrieval on String Collections. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_644
DOI: https://doi.org/10.1007/978-1-4939-2864-4_644
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering