Faster Adaptive Set Intersections for Text Searching

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4007)

Abstract

The intersection of large ordered sets is a common problem when evaluating boolean queries to a search engine. In this paper we engineer a better algorithm for this task, which improves over those proposed by Demaine, Munro and López-Ortiz [SODA 2000/ALENEX 2001], by using a variant of interpolation search. More specifically, our contributions are threefold. First, we corroborate and complete the practical study from Demaine et al. on comparison-based intersection algorithms. Second, we show that in practice replacing binary search and galloping (one-sided binary) search [4] by interpolation search improves the performance of each of the main intersection algorithms. Third, we introduce and test variants of interpolation search, which results in an even better intersection algorithm.
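
To make the idea in the abstract concrete, the following is a minimal Python sketch of intersecting two sorted arrays with a pluggable search routine. It is not the authors' implementation: the function names `gallop`, `interpolate` and `intersect` are hypothetical, the inputs are assumed to be sorted lists of distinct integers, and `interpolate` is plain textbook interpolation search rather than any of the variants studied in the paper.

```python
from bisect import bisect_left


def gallop(a, target, lo):
    """Return the smallest index i >= lo with a[i] >= target (len(a) if none).

    Galloping (one-sided binary) search: double the step, then binary search.
    """
    n = len(a)
    if lo >= n or a[lo] >= target:
        return lo
    step = 1
    # Double the step until we overshoot the target or the end of the array.
    while lo + step < n and a[lo + step] < target:
        step *= 2
    # a[lo + step // 2] < target, so the answer lies strictly after it.
    return bisect_left(a, target, lo + step // 2 + 1, min(lo + step, n))


def interpolate(a, target, lo):
    """Same contract as gallop, but probe where target is expected to be.

    Assumes integer keys (the probe position uses integer arithmetic).
    """
    hi = len(a) - 1
    # Invariant: a[i] < target for i < lo, and a[i] >= target for i > hi.
    while lo <= hi:
        if a[lo] >= target:
            return lo
        if a[hi] < target:
            return hi + 1
        # Here a[lo] < target <= a[hi], so the denominator is positive.
        pos = lo + (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return lo


def intersect(a, b, search=gallop):
    """Intersect two sorted lists of distinct integers."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Skip ahead in a to the first element not smaller than b[j].
            i = search(a, b[j], i)
        else:
            # Skip ahead in b to the first element not smaller than a[i].
            j = search(b, a[i], j)
    return out


a = [1, 3, 4, 7, 9, 12, 20, 35, 36, 50]
b = [4, 12, 36, 37, 100]
print(intersect(a, b, search=gallop))       # [4, 12, 36]
print(intersect(a, b, search=interpolate))  # [4, 12, 36]
```

Both search routines share the same contract, so swapping one for the other changes only how many probes each skip costs, not the result. Whether interpolation search actually wins depends on how evenly the keys are distributed, which is the kind of question the paper studies experimentally.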

References

  1. Baeza-Yates, R.A.: A Fast Set Intersection Algorithm for Sorted Sequences. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 400–408. Springer, Heidelberg (2004)

  2. Baeza-Yates, R.A., Salinger, A.: Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences. In: Proceedings of 12th International Conference on String Processing and Information Retrieval (SPIRE), pp. 13–24 (2005)

  3. Barbay, J., Kenyon, C.: Adaptive Intersection and t-Threshold Problems. In: Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 390–399 (2002)

  4. Bentley, J.L., Yao, A.C.-C.: An almost optimal algorithm for unbounded searching. Information Processing Letters 5(3), 82–87 (1976)

  5. Blandford, D.K., Blelloch, G.E.: Compact Representations of Ordered Sets. In: Daniel, K. (ed.) ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 11–19 (2004)

  6. Demaine, E.D., Jones, T.R., Patrascu, M.: Interpolation search for non-independent data. In: Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 529–530 (2004)

  7. Demaine, E.D., López-Ortiz, A., Munro, J.I.: Adaptive set intersections, unions, and differences. In: Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 743–752 (2000)

  8. Demaine, E.D., López-Ortiz, A., Munro, J.I.: Experiments on Adaptive set intersections for text retrieval systems. In: Buchsbaum, A.L., Snoeyink, J. (eds.) ALENEX 2001. LNCS, vol. 2153, pp. 91–104. Springer, Heidelberg (2001)

  9. Estivill-Castro, V., Wood, D.: A survey of adaptive sorting algorithms. ACM Computing Surveys 24(4), 441–476 (1992)

  10. Frakes, W., Baeza-Yates, R.: Information Retrieval. Prentice-Hall, Englewood Cliffs (1992)

  11. Gonnet, G., Rogers, L., George, G.: An algorithmic and complexity analysis of interpolation search. Acta Informatica 13(1), 39–52 (1980)

  12. Hwang, F.K., Lin, S.: Optimal Merging of 2 Elements with n Elements. Acta Informatica 1, 145–158 (1971)

  13. Hwang, F.K., Lin, S.: A Simple Algorithm for Merging Two Disjoint Linearly-Ordered Sets. SIAM Journal on Computing 1, 31–39 (1972)

  14. Hwang, F.K.: Optimal Merging of 3 Elements with n Elements. SIAM Journal on Computing 9, 298–320 (1980)

  15. Manber, U., Myers, G.: Suffix arrays: A new method for on-line string searches. In: Proceedings of the 1st Symposium on Discrete Algorithms (SODA), pp. 319–327 (1990)

  16. Perl, Y., Itai, A., Avni, H.: Interpolation search: A log log n search. CACM 21(7), 550–554 (1978)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barbay, J., López-Ortiz, A., Lu, T. (2006). Faster Adaptive Set Intersections for Text Searching. In: Àlvarez, C., Serna, M. (eds) Experimental Algorithms. WEA 2006. Lecture Notes in Computer Science, vol 4007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11764298_13

  • DOI: https://doi.org/10.1007/11764298_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34597-8

  • Online ISBN: 978-3-540-34598-5

  • eBook Packages: Computer Science, Computer Science (R0)
