Exhaustive Hybrid Posting Lists Traversing Technique

Jiang, Kun; Yang, Yuexiang

doi:10.1007/978-3-319-23862-3_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9243)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

A large number of optimization techniques have been studied to address the performance challenges of web search engines, but much room for improvement remains. In this paper, we focus on inverted index traversal techniques, which directly scan the posting lists using different loop schemes and provide preliminary results for a more complicated ranking procedure. We propose a novel exhaustive index traversal technique for document-ordered indexes called hybrid-scoring at a time (HAAT). It reduces the memory consumption and candidate selection cost of the existing document at a time (DAAT) and term at a time (TAAT) techniques, at the expense of revisiting the posting lists of the remaining query terms. Preliminary analysis shows that the computational complexity of HAAT is comparable to that of existing methods. Experimental results on the TREC GOV2 collection show that our approach performs comparably to the existing DAAT baseline and achieves considerable performance gains over the TAAT baseline.
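For orientation, the two baseline loop schemes named in the abstract can be sketched as follows. This is a minimal illustration of textbook exhaustive TAAT and DAAT, not the authors' implementation, and it does not attempt HAAT, whose details are not given in the abstract. The toy index, the per-posting score values, and the function names `taat` and `daat` are all assumptions made for the example.

```python
import heapq
from collections import defaultdict

# Hypothetical document-ordered index: term -> [(doc_id, score contribution)],
# each list sorted by doc_id. The scores are made-up illustrative values.
INDEX = {
    "web": [(1, 0.5), (3, 1.0), (7, 0.25)],
    "search": [(3, 0.75), (5, 0.5), (7, 1.0)],
}


def taat(index, terms, k):
    """Term at a time: scan each posting list completely before the next."""
    accumulators = defaultdict(float)  # one accumulator per matching document
    for term in terms:
        for doc_id, score in index.get(term, []):
            accumulators[doc_id] += score
    # Top-k selection runs only after every list has been scanned.
    return heapq.nlargest(
        k, accumulators.items(), key=lambda item: (item[1], -item[0])
    )


def daat(index, terms, k):
    """Document at a time: advance all posting lists together by doc_id."""
    lists = [index.get(term, []) for term in terms]
    cursors = [0] * len(lists)
    heap = []  # min-heap of (score, -doc_id), holding at most k entries
    while True:
        # Candidate selection: the smallest doc_id under any cursor.
        current = [lst[c][0] for lst, c in zip(lists, cursors) if c < len(lst)]
        if not current:
            break
        doc_id = min(current)
        total = 0.0
        for i, lst in enumerate(lists):
            if cursors[i] < len(lst) and lst[cursors[i]][0] == doc_id:
                total += lst[cursors[i]][1]
                cursors[i] += 1
        entry = (total, -doc_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)
    return [(-neg_id, score) for score, neg_id in sorted(heap, reverse=True)]


print(taat(INDEX, ["web", "search"], 2))
print(daat(INDEX, ["web", "search"], 2))
```

In this sketch TAAT keeps an accumulator for every matching document, while DAAT keeps only a heap of size k but must select the next candidate document across all lists at every step. These are the memory and candidate selection costs the abstract refers to.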

Author information

Authors and Affiliations

College of Computer, National University of Defense Technology, Changsha, 410073, China
Kun Jiang & Yuexiang Yang

Corresponding author

Correspondence to Kun Jiang.

Editor information

Editors and Affiliations

Zhejiang University, Hangzhou, China
Xiaofei He
Xidian University, Xi'an, China
Xinbo Gao
Northwestern Polytechnical University, Xi'an, China
Yanning Zhang
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Chinese Academy of Sciences, Beijing, China
Zhi-Yong Liu
Suzhou University of Science and Technology, Suzhou, China
Baochuan Fu
Suzhou University of Science and Technology, Jiangsu, China
Fuyuan Hu
Suzhou University of Science and Technology, Jiangsu, China
Zhancheng Zhang

About this paper

Cite this paper

Jiang, K., Yang, Y. (2015). Exhaustive Hybrid Posting Lists Traversing Technique. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_1

DOI: https://doi.org/10.1007/978-3-319-23862-3_1
Published: 17 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23861-6
Online ISBN: 978-3-319-23862-3
eBook Packages: Computer Science, Computer Science (R0)
