Efficient processing of top-k queries: selective NRA algorithms

Yuan, Jing; Sun, Guangzhong; Luo, Tao; Lian, Defu; Chen, Guoliang

doi:10.1007/s10844-012-0208-5

Published: 27 May 2012

Volume 39, pages 687–710 (2012)

Journal of Intelligent Information Systems

Abstract

Efficient processing of top-k queries has drawn increasing attention from both industry and academia because of its varied applications. A low access cost is a crucial concern in top-k query processing. Typically, answering a top-k query involves two types of access: sorted access and random access. In some scenarios, the data source does not support random access. Fagin et al. proposed the No Random Access (NRA) algorithm (Fagin et al., J Comput Syst Sci 66:614–656, 2003) for this situation. In this paper, we motivate our work with a key observation about the NRA algorithm: the number of accesses can be further reduced by performing sorted accesses to the different lists of the dataset selectively rather than in parallel. Based on this insight, we propose a Selective NRA (SNRA) algorithm that aims to cut unnecessary access cost. We then optimize the SNRA algorithm in terms of runtime cost and present the SNRA-opt algorithm. Furthermore, we address the problem of instance optimality theoretically and turn SNRA and SNRA-opt into instance-optimal algorithms, termed Hybrid-SNRA (HSNRA) and HSNRA-opt, respectively. Extensive experimental results show that our algorithms perform significantly fewer sorted accesses than NRA and its state-of-the-art variations. In terms of runtime cost, the proposed SNRA-opt and HSNRA-opt algorithms are two orders of magnitude faster than the NRA algorithm. In addition, we discuss the parameter selection problem of the SNRA algorithms, both theoretically and experimentally.
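
For readers unfamiliar with the baseline, the following is a minimal sketch of the NRA algorithm of Fagin et al. (2003) with parallel sorted accesses. It is not the SNRA algorithm of this paper. It assumes sum aggregation, non-negative scores, k ≥ 1, and that an object absent from a list scores 0 in that list. The function name nra_top_k and the list-of-pairs data layout are hypothetical choices made for illustration only.

```python
def nra_top_k(lists, k):
    """Baseline NRA sketch: sorted access only, sum aggregation.

    lists: one list per attribute, each holding (object, score) pairs
           sorted by score in descending order (assumed layout).
    Returns (top-k objects, number of sorted accesses performed).
    """
    m = len(lists)
    seen = {}        # object -> {list index: score seen in that list}
    last = [0] * m   # last score read from each list
    accesses = 0
    max_depth = max((len(lst) for lst in lists), default=0)

    def ranked_with_bounds():
        # Lower bound: unknown scores count as 0.
        lower = {o: sum(sc.values()) for o, sc in seen.items()}
        # Upper bound: unknown scores are at most the last score read.
        upper = {
            o: lower[o] + sum(last[i] for i in range(m) if i not in sc)
            for o, sc in seen.items()
        }
        ranked = sorted(seen, key=lambda o: (-lower[o], -upper[o]))
        return ranked, lower, upper

    for depth in range(max_depth):
        # Parallel step: one sorted access to every list per round.
        for i, lst in enumerate(lists):
            if depth < len(lst):
                obj, score = lst[depth]
                seen.setdefault(obj, {})[i] = score
                last[i] = score
                accesses += 1
            else:
                last[i] = 0  # exhausted list: nothing unseen remains
        if len(seen) < k:
            continue
        ranked, lower, upper = ranked_with_bounds()
        kth_lower = lower[ranked[k - 1]]
        # Rivals: seen objects outside the top k, plus any unseen object,
        # whose score is bounded by the sum of the last scores read.
        rivals = [upper[o] for o in ranked[k:]] + [sum(last)]
        if kth_lower >= max(rivals):
            return ranked[:k], accesses

    ranked, _, _ = ranked_with_bounds()
    return ranked[:k], accesses


example = [
    [("a", 10), ("b", 8), ("c", 3), ("d", 2), ("e", 1)],
    [("b", 9), ("a", 6), ("d", 4), ("c", 3), ("e", 1)],
]
print(nra_top_k(example, 1))  # (['b'], 4)
```

In this sketch every round reads one entry from every list. The selective approach described in the abstract would instead choose which lists to read, which is where the saving in sorted accesses is claimed to come from.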

Notes

http://kdd.ics.uci.edu
http://www.dianping.com
c = 1, p < ∞ → HSNRA; c = 1, p = ∞ → SNRA; c > 1, p < ∞ → HSNRA-opt; c > 1, p = ∞ → SNRA-opt.

References

Akbarinia, R., Pacitti, E., & Valduriez, P. (2007). Best position algorithms for top-k queries. In Proceedings of the 33rd international conference on very large data bases, VLDB ’07 (pp. 495–506).
Balke, W., Güntzer, U., & Kießling, W. (2010). On real-time top k querying for mobile services. In On the Move to Meaningful Internet Systems 2002: CoopIS, DOA, and ODBASE (pp. 125–143).
Bast, H., Majumdar, D., Schenkel, R., Theobald, M., & Weikum, G. (2006). IO-Top-k: Index-access optimized top-k query processing. In Proceedings of the 32nd international conference on very large data bases, VLDB Endowment, VLDB ’06 (pp. 475–486).
Fagin, R. (1999). Combining fuzzy information from multiple systems. Journal of Computer and System Sciences, 58, 83–99.
Fagin, R. (2002). Combining fuzzy information: An overview. ACM SIGMOD Record, 31(2), 109–118.
Fagin, R., Lotem, A., & Naor, M. (2003). Optimal aggregation algorithms for middleware. Journal of Computer and System Sciences, 66, 614–656.
Getoor, L., & Diehl, C. (2005). Link mining: A survey. ACM SIGKDD Explorations Newsletter, 7(2), 12.
Güntzer, U., Balke, W., & Kießling, W. (2001). Towards efficient multi-feature queries in heterogeneous environments. In Proceedings of the IEEE international conference on information technology: Coding and computing (pp. 622–628).
Gurský, P., & Vojtáš, P. (2008). Speeding up the NRA algorithm. In Proceedings of the 2nd international conference on scalable uncertainty management, SUM ’08 (pp. 243–255).
Hwang, S., & Chang, K. (2007). Optimizing top-k queries for middleware access: A unified cost-based approach. ACM Transactions on Database Systems (TODS), 32(1), 5.
Long, X., & Suel, T. (2005). Three-level caching for efficient query processing in large web search engines. In Proceedings of the 14th international conference on world wide web, WWW ’05 (pp. 257–266). New York, NY, USA.
Luo, Y., Lin, X., Wang, W., & Zhou, X. (2007). Spark: Top-k keyword query in relational databases. In Proceedings of the 2007 ACM SIGMOD international conference on management of data (pp. 115–126). ACM.
Mamoulis, N., Yiu, M., Cheng, K., & Cheung, D. (2007). Efficient top-k aggregation of ranked inputs. ACM Transactions on Database Systems (TODS), 32(3), 19.
Nepal, S., & Ramakrishna, M. (1999). Query processing issues in image (multimedia) databases. In Proceedings 15th international conference on data engineering (pp. 22–29).
Salton, G. (1989). Automatic text processing: The transformation, analysis, and retrieval of information by computer. Addison-Wesley.
Shmueli-Scheuer, M., Li, C., Mass, Y., Roitman, H., Schenkel, R., & Weikum, G. (2009). Best-effort top-k query processing under budgetary constraints. In IEEE international conference on data engineering (pp. 928–939). IEEE.
Theobald, M., Weikum, G., & Schenkel, R. (2004). Top-k query evaluation with probabilistic guarantees. In Proceedings of the thirtieth international conference on very large data bases-volume 30, VLDB endowment (p. 659).
Wimmers, E., Haas, L., Roth, M., & Braendli, C. (1999). Using Fagin’s algorithm for merging ranked results in multimedia middleware. In Fourth IFCIS international conference on cooperative information systems, citeseer (pp. 267–278).
Xin, D., Han, J., & Chang, K. (2007). Progressive and selective merge: Computing top-k with ad-hoc ranking functions. In Proceedings of the 2007 ACM SIGMOD international conference on management of data (pp. 103–114). ACM.
Yuan, J., Sun, G. Z., Tian, Y., Chen, G., & Liu, Z. (2009). Selective-NRA algorithms for top-k queries. In Proceedings of the joint international conferences on advances in data and web management, APWeb/WAIM ’09 (pp. 15–26). Berlin, Heidelberg: Springer-Verlag.
Zhu, M., Shi, S., Li, M., & Wen, J. R. (2007). Effective top-k computation in retrieving structured documents with term-proximity support. In Proceedings of the sixteenth ACM conference on information and knowledge management, CIKM ’07 (pp. 771–780).

Acknowledgements

This work is supported by the National Natural Science Foundation of China under grants No. 61033009 and No. 60873210, and by the Anhui Natural Science Foundation under grant No. 1208085QF106.

Author information

Authors and Affiliations

University of Science and Technology of China, Hefei, Anhui, China
Jing Yuan, Guangzhong Sun, Tao Luo, Defu Lian & Guoliang Chen

Corresponding author

Correspondence to Jing Yuan.

About this article

Cite this article

Yuan, J., Sun, G., Luo, T. et al. Efficient processing of top-k queries: selective NRA algorithms. J Intell Inf Syst 39, 687–710 (2012). https://doi.org/10.1007/s10844-012-0208-5

Received: 20 December 2011
Revised: 27 April 2012
Accepted: 09 May 2012
Published: 27 May 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s10844-012-0208-5
