A SOM Variant Based on the Wilcoxon Test for Document Organization and Retrieval

  • Conference paper
Artificial Neural Networks — ICANN 2002 (ICANN 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2415)

Abstract

This paper proposes a variant of the self-organizing map (SOM) algorithm for document organization and retrieval. The available documents are encoded using bigrams, and signed ranks are assigned to these bigrams according to their frequencies. A novel metric based on the Wilcoxon signed-rank test exploits these ranks to assess the contextual similarity between documents. This metric replaces the Euclidean distance that the self-organizing map algorithm uses to identify the winner neuron. Experiments performed with both algorithms demonstrate that the proposed variant outperforms the standard self-organizing map algorithm in terms of the average recall-precision curves.
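The abstract does not give the exact form of the metric, so the following is only a rough sketch of the general idea, not the authors' method. It assumes word-level bigrams, uses the textbook Wilcoxon signed-rank sums (W+ and W-) computed on the differences between two bigram-frequency vectors, and turns them into a dissimilarity with a stand-in formula, |W+ - W-| / (W+ + W-), that is purely an assumption made here for illustration. All function names (`bigram_counts`, `signed_rank_sums`, `dissimilarity`, `find_winner`) are hypothetical.

```python
from collections import Counter


def bigram_counts(text):
    """Count word bigrams in a text (assumption: word-level bigrams)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_sums(x, y):
    """Standard Wilcoxon signed-rank sums (W+, W-) for paired vectors x and y."""
    # Zero differences are discarded, as in the classical test.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


def dissimilarity(x, y):
    """Stand-in dissimilarity in [0, 1]; NOT the metric defined in the paper."""
    w_plus, w_minus = signed_rank_sums(x, y)
    total = w_plus + w_minus
    if total == 0:
        return 0.0  # identical vectors: no nonzero differences
    return abs(w_plus - w_minus) / total


def find_winner(document_vector, neuron_weights):
    """Index of the neuron with the smallest dissimilarity to the document."""
    scores = [dissimilarity(document_vector, w) for w in neuron_weights]
    return min(range(len(scores)), key=scores.__getitem__)
```

In a full SOM, `find_winner` would take the place of the usual Euclidean nearest-neuron search, with `document_vector` holding the bigram frequencies of one document over a fixed bigram vocabulary. Note that this stand-in score is zero whenever positive and negative differences balance, so it is a weaker notion of distance than the Euclidean one; the metric in the paper may well be defined differently.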

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Georgakis, A., Kotropoulos, C., Pitas, I. (2002). A SOM Variant Based on the Wilcoxon Test for Document Organization and Retrieval. In: Dorronsoro, J.R. (eds) Artificial Neural Networks — ICANN 2002. ICANN 2002. Lecture Notes in Computer Science, vol 2415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46084-5_161

  • DOI: https://doi.org/10.1007/3-540-46084-5_161

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44074-1

  • Online ISBN: 978-3-540-46084-8

  • eBook Packages: Springer Book Archive
