Retrieval of Short Documents from Discussion Forums

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2338)

Abstract

The prevalence of short and ill-written documents today has brought into question the effectiveness of various modern retrieval systems. We evaluated three retrieval systems: LSI, Keyphind and a Google simulator. The results showed that LSI performed better than Keyphind or the Google simulator. On the other hand, recall-precision graphs revealed that at low recall levels the performance of the Google simulator was higher than that of LSI and Keyphind. When retrieval was weighted to favour more highly relevant documents, the Google approach was favourable.
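The abstract compares the systems using recall-precision graphs but does not describe how the points were computed. As a minimal sketch, assuming the conventional interpolated precision at standard recall levels (not necessarily the authors' exact procedure), one ranked result list could be scored as follows. The function names and the document identifiers are hypothetical and exist only for illustration.

```python
from typing import List, Sequence, Set, Tuple


def precision_recall_points(
    ranking: Sequence[str], relevant: Set[str]
) -> List[Tuple[float, float]]:
    """Return (recall, precision) measured at each relevant document in the ranking."""
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points


def interpolated_precision(
    points: Sequence[Tuple[float, float]], levels: Sequence[float]
) -> List[float]:
    """Interpolated precision: the best precision at any recall >= each level."""
    return [
        max((p for r, p in points if r >= level), default=0.0)
        for level in levels
    ]


# Hypothetical query: five retrieved posts, three of which are relevant.
ranking = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d3", "d1", "d4"}

points = precision_recall_points(ranking, relevant)
curve = interpolated_precision(points, [i / 10 for i in range(11)])
```

A system that ranks relevant documents early, as the abstract reports for the Google simulator, shows high values at the low-recall end of `curve` even if its values at high recall are lower than those of another system.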


References

  1. Brin, S. & Page, L. 1998. The anatomy of a large-scale hypertextual web search engine. Proceedings of the 7th World Wide Web Conference, Brisbane, Australia.

  2. Bowes, J. 2001. Knowledge management through data mining in discussion forums. M.Sc. dissertation, University of Saskatchewan, Saskatoon.

  3. Gutwin, C., Paynter, G., Witten, I., Nevill-Manning, C. & Frank, E. 1999. Improving browsing in digital libraries with keyphrase indexes. Technical Report 98-1, Department of Computer Science, Univ. of Sask.

  4. Shaw, W.M., Jr., Burgin, R. & Howell, P. 1997. Performance standards and evaluations in IR test collections: Cluster-based-retrieval methods. Information Processing and Management, 33, 1–14.

  5. Siegel, S. 1956. Nonparametric statistics for the behavioral sciences. McGraw-Hill Book Company, New York.

  6. Telcordia Technologies 2001. Telcordia’s Latent Semantic Indexing Software (LSI), http://lsi.research.telcordia.com/lsi/papers/execsum.html


Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, F., Greer, J. (2002). Retrieval of Short Documents from Discussion Forums. In: Cohen, R., Spencer, B. (eds) Advances in Artificial Intelligence. Canadian AI 2002. Lecture Notes in Computer Science, vol 2338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47922-8_30

  • DOI: https://doi.org/10.1007/3-540-47922-8_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43724-6

  • Online ISBN: 978-3-540-47922-2

  • eBook Packages: Springer Book Archive
