Parallel DSIR Text Retrieval System

  • Conference paper
Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI 1999)

Abstract

We present a study of the applicability of distributed computing techniques to a million-page free-text document retrieval problem. We propose a high-performance DSIR retrieval algorithm that runs on a Beowulf cluster of Pentium PCs using the PVM message-passing library. DSIR is a vector-space retrieval model in which the semantic similarity between documents and queries is characterized by semantic vectors derived from the document collection. Retrieving relevant answers then amounts to computing the geometric proximity between a large number of document vectors and the query vectors in a semantic vector space. We test the parallel DSIR algorithm on the large-scale TREC-7 collection, present the experimental results, and investigate both computing performance and scalability with problem size.
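
The retrieval step described in the abstract (ranking documents by geometric proximity to a query vector, with the document set divided among cluster nodes) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the paper: cosine similarity is used as the proximity measure, the "workers" are simulated by a sequential loop instead of PVM message passing, and all function names (`cosine`, `partition`, `local_top_k`, `retrieve`) are hypothetical.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two equal-length dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def partition(doc_vectors, n_workers):
    """Split (doc_id, vector) pairs into n_workers roughly equal slices."""
    return [doc_vectors[i::n_workers] for i in range(n_workers)]


def local_top_k(query, docs, k):
    """Work of one (simulated) worker: score its slice, keep its k best (score, doc_id) pairs."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in docs)
    return heapq.nlargest(k, scored)


def retrieve(query, doc_vectors, n_workers, k):
    """Merge the per-worker top-k lists into one global top-k ranking."""
    # Assumption: in a real cluster each slice would be scored on a separate node;
    # here the slices are processed one after another in a single process.
    partials = [local_top_k(query, part, k) for part in partition(doc_vectors, n_workers)]
    merged = heapq.nlargest(k, (hit for part in partials for hit in part))
    return [(doc_id, score) for score, doc_id in merged]


if __name__ == "__main__":
    docs = [("d1", [1.0, 0.0]), ("d2", [0.0, 1.0]), ("d3", [1.0, 1.0])]
    for doc_id, score in retrieve([1.0, 0.0], docs, n_workers=2, k=2):
        print(doc_id, round(score, 3))
```

Because each worker only returns its own k best hits, the merge step handles at most k × n_workers candidates regardless of collection size, which is one plausible reason this kind of partitioning suits a message-passing cluster.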

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rungsawang, A., Tangpong, A., Laohawee, P. (1999). Parallel DSIR Text Retrieval System. In: Dongarra, J., Luque, E., Margalef, T. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 1999. Lecture Notes in Computer Science, vol 1697. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48158-3_40

  • DOI: https://doi.org/10.1007/3-540-48158-3_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66549-6

  • Online ISBN: 978-3-540-48158-4

  • eBook Packages: Springer Book Archive
