ABSTRACT
The inverted index is a fundamental data structure for efficient search in applications such as full-text search engines, document analytics, and other information retrieval systems. The storage requirements and query load for these structures have been growing rapidly. An ideal indexing system should therefore keep the index small while keeping query processing time low. Previous work has mainly focused on using CPUs and GPUs to exploit query parallelism while relying on state-of-the-art compression schemes to fit the index in memory. However, scaling parallelism to fully utilize memory bandwidth on these architectures remains challenging. In this work, we present IIU, a novel inverted index processing unit that optimizes query performance while maintaining a low memory overhead for index storage. To this end, we co-design the indexing scheme and the hardware accelerator so that the accelerator can process a highly compressed inverted index at high throughput. In addition, IIU provides flexible interconnects between modules to take advantage of both intra- and inter-query parallelism. Our evaluation using a cycle-level simulator demonstrates that IIU provides an average 13.8× query latency reduction and 5.4× throughput improvement across different query types, while reducing average energy consumption by 18.6×, compared to Apache Lucene, a production-grade full-text search framework.
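For readers unfamiliar with the data structure, the following is a minimal software sketch of an inverted index with compressed posting lists and a two-term AND query. It is not IIU's indexing scheme or hardware design: the function names (`build_index`, `delta_encode`, `delta_decode`, `intersect`) are hypothetical, and plain delta (gap) encoding is used only as a generic stand-in for the compression schemes the abstract refers to.

```python
from collections import defaultdict
from itertools import accumulate


def build_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def delta_encode(doc_ids):
    """Store the first ID, then the gaps between consecutive sorted IDs."""
    return [b - a for a, b in zip([0] + doc_ids, doc_ids)]


def delta_decode(gaps):
    """Recover the sorted document IDs by prefix-summing the gaps."""
    return list(accumulate(gaps))


def intersect(a, b):
    """Merge-style intersection of two sorted posting lists (an AND query)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


# Illustrative corpus (made up for this sketch).
docs = [
    "inverted index search",
    "index compression",
    "fast search over a compressed index",
    "hardware accelerator",
]
index = build_index(docs)
compressed = {term: delta_encode(ids) for term, ids in index.items()}

# Query "index AND search": decompress both posting lists, then intersect.
hits = intersect(delta_decode(compressed["index"]),
                 delta_decode(compressed["search"]))
```

In this sketch every query must decompress its posting lists before intersecting them, which illustrates the general tension between a small index and a low query processing time that the abstract describes.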
Index Terms
- IIU: Specialized Architecture for Inverted Index Search