DOI: 10.1145/3373376.3378521
research-article

IIU: Specialized Architecture for Inverted Index Search

Published: 13 March 2020

ABSTRACT

The inverted index serves as a fundamental data structure for efficient search across various applications such as full-text search engines, document analytics, and other information retrieval systems. The storage requirements and query load for these structures have been growing rapidly. Thus, an ideal indexing system should maintain a small index size with a low query processing time. Previous work has mainly focused on using CPUs and GPUs to exploit query parallelism while using state-of-the-art compression schemes to fit the index in memory. However, scaling parallelism to fully utilize memory bandwidth on these architectures remains challenging. In this work, we present IIU, a novel inverted index processing unit, to optimize query performance while maintaining a low memory overhead for index storage. To this end, we co-design the indexing scheme and hardware accelerator so that the accelerator can process a highly compressed inverted index at high throughput. In addition, IIU provides flexible interconnects between modules to take advantage of both intra- and inter-query parallelism. Our evaluation using a cycle-level simulator demonstrates that IIU provides an average 13.8× query latency reduction and 5.4× throughput improvement across different query types, while reducing average energy consumption by 18.6×, compared to Apache Lucene, a production-grade full-text search framework.
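
For readers unfamiliar with the data structure, the following is a minimal software sketch of an inverted index, assuming whitespace tokenization and integer document IDs. It shows a generic textbook layout (sorted posting lists, gap encoding as a typical first step before compression, and a two-pointer intersection for conjunctive queries). It is not the indexing scheme or the hardware design of IIU, and all function names here (`build_index`, `to_gaps`, `from_gaps`, `intersect`) are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def to_gaps(doc_ids):
    """Gap-encode a sorted posting list (differences between neighbours).

    Small gaps need fewer bits, which is why compression schemes usually
    operate on gaps rather than on raw document IDs.
    """
    gaps, prev = [], 0
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def from_gaps(gaps):
    """Invert to_gaps with a running prefix sum."""
    doc_ids, total = [], 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def intersect(a, b):
    """Two-pointer intersection of two sorted posting lists (an AND query)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


# Illustrative corpus; the documents are made up for this example.
docs = [
    "inverted index search",
    "index compression schemes",
    "query latency and throughput",
    "compressed inverted index",
]
index = build_index(docs)
hits = intersect(index["inverted"], index["index"])  # documents with both terms
```

In this sketch a single query is processed sequentially; the abstract's intra-query parallelism would correspond to splitting the work of one such intersection, and inter-query parallelism to running many queries at once.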


      • Published in

        ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems
        March 2020
        1412 pages
        ISBN:9781450371025
        DOI:10.1145/3373376

        Copyright © 2020 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 535 of 2,713 submissions, 20%
