Years and Authors of Summarized Original Work
- 2009; Hon, Shah, Vitter
- 2013; Belazzougui, Navarro, Valenzuela
- 2013; Tsur
- 2014; Hon, Shah, Thankachan, Vitter
- 2014; Navarro, Thankachan
Problem Definition
We face the following problem.
Problem 1 (Top-k document retrieval)
Let \(\mathcal{D} = \{\mathsf{T}_{1}, \mathsf{T}_{2}, \ldots, \mathsf{T}_{D}\}\) be a collection of D documents of n characters in total, drawn from an alphabet set Σ = [σ]. The relevance of a document \(\mathsf{T}_{d}\) with respect to a pattern P, denoted by w(P,d), is a function of the set of occurrences of P in \(\mathsf{T}_{d}\). Our task is to index \(\mathcal{D}\) such that, whenever a pattern P[1,p] and a parameter k come as a query, the k documents with the highest w(P,⋅) values can be reported efficiently.
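To make the query semantics concrete, the following is a minimal brute-force sketch of Problem 1. It is not one of the compressed indexes summarized in this entry: it scans every document at query time and serves only as a reference baseline. Several choices here are assumptions made for illustration: the function name `top_k_documents` is hypothetical, the relevance w(P,d) is instantiated as term frequency (the number of possibly overlapping occurrences of P in the document), documents with no occurrence are omitted, and ties are broken by smaller document identifier.

```python
import heapq


def top_k_documents(docs, pattern, k):
    """Brute-force baseline: return up to k (doc_id, score) pairs.

    Assumptions (not from the source): w(P, d) is term frequency,
    document ids are 1-based, zero-score documents are skipped,
    and ties go to the smaller document id.
    """

    def term_frequency(text, pat):
        # Count occurrences of pat in text, allowing overlaps.
        count = 0
        start = text.find(pat)
        while start != -1:
            count += 1
            start = text.find(pat, start + 1)
        return count

    # Score every document; this full scan is exactly what an index avoids.
    scored = []
    for doc_id, text in enumerate(docs, start=1):
        score = term_frequency(text, pattern)
        if score > 0:
            scored.append((score, doc_id))

    # Select the k best: highest score first, then smallest document id.
    best = heapq.nsmallest(k, scored, key=lambda sd: (-sd[0], sd[1]))
    return [(doc_id, score) for score, doc_id in best]


if __name__ == "__main__":
    collection = ["banana", "ananas", "bandana", "cab"]
    print(top_k_documents(collection, "ana", 2))
```

The scan takes time proportional to the total collection size for every query, whereas the goal stated in Problem 1 is to answer queries efficiently from a prebuilt index, ideally one that occupies space close to that of the compressed collection.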
Recommended Reading
Belazzougui D, Navarro G, Valenzuela D (2013) Improved compressed indexes for full-text document retrieval. J Discret Algorithms 18:3–13
Gagie T, Kärkkäinen J, Navarro G, Puglisi SJ (2013) Colored range queries and document retrieval. Theor Comput Sci 483:36–50
Hon WK, Shah R, Vitter JS (2009) Space-efficient framework for top-k string retrieval problems. In: FOCS, Atlanta, pp 713–722
Hon WK, Shah R, Thankachan SV, Vitter JS (2014) Space-efficient frameworks for top-k string retrieval. J ACM 61(2):9
Navarro G (2014) Spaces, trees, and colors: the algorithmic landscape of document retrieval on sequences. ACM Comput Surv 46(4):52
Navarro G, Mäkinen V (2007) Compressed full-text indexes. ACM Comput Surv 39(1):2
Navarro G, Nekrich Y (2012) Top-k document retrieval in optimal time and linear space. In: SODA, Kyoto, pp 1066–1077
Navarro G, Thankachan SV (2014) New space/time tradeoffs for top-k document retrieval on sequences. Theor Comput Sci 542:83–97
Raman R, Raman V, Satti SR (2007) Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans Algorithms 3(4):43
Russo L, Navarro G, Oliveira AL (2011) Fully compressed suffix trees. ACM Trans Algorithms 7(4):53
Shah R, Sheng C, Thankachan SV, Vitter JS (2013) Top-k document retrieval in external memory. In: ESA, Sophia Antipolis, pp 803–814
Tsur D (2013) Top-k document retrieval in optimal space. Inf Process Lett 113(12):440–443
Copyright information
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Thankachan, S.V. (2016). Compressed Document Retrieval on String Collections. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_644
DOI: https://doi.org/10.1007/978-1-4939-2864-4_644
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering