Abstract
We propose an adaptation of the k-Nearest Neighbor (k-NN) algorithm that uses category-specific thresholds in a multiclass setting where a document can belong to more than one class. Our method uses feedback to tune the thresholds and, in turn, to improve classification performance over time. The experiments were run on the InFile data, comprising 100,000 English documents and 50 topics.
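The idea of per-category thresholds tuned by feedback can be illustrated with a minimal Python sketch. The abstract does not specify the scoring function or the update rule, so everything here is an assumption made for illustration: documents are sparse term-weight dictionaries, a topic's score is the sum of cosine similarities of the k nearest stored documents carrying that topic, and each threshold moves by a fixed `step` after a wrong decision. The names `AdaptiveKNNFilter`, `classify`, and `feedback` are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class AdaptiveKNNFilter:
    """Illustrative multi-label k-NN filter with one threshold per topic.

    Assumption: scoring and threshold updates are simple placeholders,
    not the rules used in the paper.
    """

    def __init__(self, topics, k=3, init_threshold=0.5, step=0.05):
        self.k = k
        self.step = step
        self.thresholds = {t: init_threshold for t in topics}
        self.memory = []  # stored (vector, set of topics) pairs

    def add_example(self, vec, topics):
        self.memory.append((vec, set(topics)))

    def scores(self, vec):
        # Rank stored documents by similarity and keep the k nearest.
        ranked = sorted(
            ((cosine(vec, v), labels) for v, labels in self.memory),
            key=lambda pair: pair[0],
            reverse=True,
        )[: self.k]
        out = {t: 0.0 for t in self.thresholds}
        for sim, labels in ranked:
            for t in labels:
                if t in out:
                    out[t] += sim
        return out

    def classify(self, vec):
        # A document is assigned every topic whose score reaches
        # that topic's own threshold, so it may get zero or many topics.
        s = self.scores(vec)
        return {t for t, v in s.items() if v >= self.thresholds[t]}

    def feedback(self, vec, predicted, true_topics):
        true_topics = set(true_topics)
        for t in self.thresholds:
            if t in predicted and t not in true_topics:
                # False positive: make this topic harder to assign.
                self.thresholds[t] += self.step
            elif t in true_topics and t not in predicted:
                # False negative: make this topic easier to assign.
                self.thresholds[t] = max(0.0, self.thresholds[t] - self.step)
        # Keep the corrected document as a future neighbour.
        self.add_example(vec, true_topics)


flt = AdaptiveKNNFilter(["sport", "finance"], k=1)
flt.add_example({"goal": 1.0, "match": 1.0}, {"sport"})
predicted = flt.classify({"goal": 1.0, "match": 1.0})
flt.feedback({"goal": 1.0, "match": 1.0}, predicted, {"sport"})
```

Keeping one threshold per topic lets a frequent topic and a rare topic settle at different operating points, which a single global threshold cannot do.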
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bodinier, V., Qamar, A.M., Gaussier, E. (2009). Online Document Filtering Using Adaptive k-NN. In: Peters, C., et al. Evaluating Systems for Multilingual and Multimodal Information Access. CLEF 2008. Lecture Notes in Computer Science, vol 5706. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04447-2_126
DOI: https://doi.org/10.1007/978-3-642-04447-2_126
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04446-5
Online ISBN: 978-3-642-04447-2
eBook Packages: Computer Science (R0)