Abstract
Efficient retrieval of the most relevant (i.e., top-k) tuples is an important requirement in information management systems that access large amounts of data. In general, answering a top-k query means retrieving the k objects that score best under an objective function. We propose improvements to the best position algorithm BPA-2 [2]. To the best of our knowledge, BPA-2 is currently the fastest top-k query answering approach based on the widely known and applied Threshold Algorithm (TA) of Fagin et al. [5]. Our improvements significantly reduce time and memory consumption and scale better than BPA-2: (1) we dynamically create value-based rather than object-based index structures from the query restrictions posed by the user, and (2) we introduce look-ahead techniques to process those index structures. Whereas BPA-2 processes all pre-calculated indexes in parallel, we always examine the most promising index structure next. We implemented a prototype of our fast top-k query answering (FTA) approach. In our experiments, it improved on BPA-2 by one to two orders of magnitude.
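For context, the classic TA that BPA-2 and the approach proposed here build on can be sketched as follows. This is a minimal illustration of TA only, not of BPA-2 or FTA. The function name `threshold_algorithm`, the dict-based score lists, and the use of a plain sum as the objective function are assumptions made for the example, and every object is assumed to appear in every list.

```python
import heapq


def threshold_algorithm(lists, k):
    """Sketch of the Threshold Algorithm (TA) with sum as the objective function.

    lists: one dict {object_id: local_score} per attribute (hypothetical layout);
           every object is assumed to appear in every dict.
    Returns (top_k, rows_read), where top_k is a list of (object_id, total_score)
    and rows_read is the number of positions scanned under sorted access.
    """
    # Sorted access: each list ordered by descending local score.
    sorted_lists = [
        sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        for scores in lists
    ]
    n = len(sorted_lists[0])
    seen = {}  # object_id -> aggregated score

    for depth in range(n):
        last_scores = []
        for ranked in sorted_lists:
            obj, score = ranked[depth]
            last_scores.append(score)
            if obj not in seen:
                # Random access: look up the object's score in every list.
                seen[obj] = sum(scores[obj] for scores in lists)

        # No unseen object can score higher than the threshold.
        threshold = sum(last_scores)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, depth + 1

    # Lists exhausted: every object has been seen.
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1]), n
```

Calling the function with two score lists and `k=2` returns the two best objects together with the number of positions read, which shows how TA can stop before scanning the lists completely.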
References
1. Agrawal, S., Chaudhuri, S.: Automated ranking of database query results. In: CIDR, pp. 888–899 (2003)
2. Akbarinia, R., Pacitti, E., Valduriez, P.: Best position algorithms for top-k queries. In: Proc. of the 33rd Intl. Conf. on VLDBs, VLDB 2007 (2007)
3. Dabringer, C., Eder, J.: Efficient top-k retrieval for user preference queries. In: Proc. of the 26th ACM Symposium on Applied Computing (2011)
4. Eder, J., Dabringer, C., Schicho, M., Stark, K.: Information systems for federated biobanks. Trans. on Large Scale Data and Knowledge Centered Systems (2009)
5. Fagin, R., Lotem, A., Naor, M.: Optimal aggregation algorithms for middleware. In: Proc. of the 2001 ACM Symp. on Principles of Database Systems. ACM, New York (2001)
6. Güntzer, U., Balke, W.-T., Kiessling, W.: Optimizing multi-feature queries for image databases. In: Proc. of the 26th Int. Conf. on VLDBs, pp. 419–428. Morgan Kaufmann, San Francisco (2000)
7. Ilyas, I.F., Beskales, G., Soliman, M.A.: A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv. 40(4), 1–58 (2008)
8. Lesot, M., Rifqi, M., Benhadda, H.: Similarity measures for binary and numerical data. Int. J. Knowl. Eng. Soft Data Paradigm. 1, 63–84 (2009)
9. Marian, A., Bruno, N., Gravano, L.: Evaluating top-k queries over web-accessible databases. ACM Trans. Database Syst. 29(2), 319–362 (2004)
10. Robertson, S.: Understanding inverse document frequency: on theoretical arguments for idf. Journal of Documentation 60, 503–520 (2004)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dabringer, C., Eder, J. (2011). Fast Top-K Query Answering. In: Hameurlain, A., Liddle, S.W., Schewe, K.-D., Zhou, X. (eds) Database and Expert Systems Applications. DEXA 2011. Lecture Notes in Computer Science, vol 6861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23091-2_14
DOI: https://doi.org/10.1007/978-3-642-23091-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23090-5
Online ISBN: 978-3-642-23091-2
eBook Packages: Computer Science (R0)