Abstract
Web search engines, such as AltaVista and Infoseek, handle tremendous loads by exploiting the parallelism implicit in their tasks and using symmetric multiprocessors to support their services. The web searching problem that they solve is a special case of the more general information retrieval (IR) problem of locating documents relevant to the information needs of users. In this paper, we investigate how to exploit a symmetric multiprocessor to build high-performance IR servers. Although the problem can be solved by throwing lots of CPU and disk resources at it, the important questions are how much of which hardware, and what software structure, are needed to exploit hardware resources effectively. We have found, to our surprise, that in some cases adding hardware degrades performance rather than improving it. We show that multiple threads are needed to fully utilize hardware resources. Our investigation is based on InQuery, a state-of-the-art full-text information retrieval engine.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, Z., McKinley, K.S., Cahoon, B. (1998). The hardware/software balancing act for information retrieval on symmetric multiprocessors. In: Pritchard, D., Reeve, J. (eds) Euro-Par’98 Parallel Processing. Euro-Par 1998. Lecture Notes in Computer Science, vol 1470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0057896
DOI: https://doi.org/10.1007/BFb0057896
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64952-6
Online ISBN: 978-3-540-49920-6
eBook Packages: Springer Book Archive