Abstract
We investigate the following data mining problem from document retrieval: given a large set of documents, we need to find those that relate to a user's interest in as few iterations of human testing or checking as possible. In each iteration, a comparatively small batch of documents is evaluated for relevance to that interest. We apply active learning techniques based on Support Vector Machines for evaluating successive batches, a process called relevance feedback. In experiments, our proposed approach has proved very useful for document retrieval with relevance feedback. In this paper, we adopt several representations of the Vector Space Model and several rules for selecting the documents displayed at each iteration, and we compare retrieval effectiveness across these settings.
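As an illustration only, the two design choices named above (the document representation and the rule for selecting which documents to display) might be sketched as follows. The abstract does not specify the representations or rules used, so everything here is an assumption: the function names, the `log(N / df)` form of IDF, and the two rule names `"boundary"` and `"top"` are hypothetical. The scores are assumed to be decision values from an SVM trained on the documents judged so far; training itself is not shown.

```python
import math
from collections import Counter


def term_frequency(doc_tokens):
    """Raw term-frequency representation of one tokenized document."""
    return dict(Counter(doc_tokens))


def tf_idf(corpus_tokens):
    """TF-IDF representation of a corpus (assumed variant: idf = log(N / df))."""
    n_docs = len(corpus_tokens)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in corpus_tokens for term in set(doc))
    return [
        {t: c * math.log(n_docs / df[t]) for t, c in Counter(doc).items()}
        for doc in corpus_tokens
    ]


def select_batch(scores, labeled, batch_size, rule):
    """Pick indices of not-yet-judged documents to display next.

    scores  : decision values f(x), one per document (assumed to come
              from a classifier trained on the judged documents)
    labeled : set of indices the user has already judged
    rule    : "boundary" -> smallest |f(x)| (closest to the decision boundary)
              "top"      -> largest f(x) (most confidently relevant)
    Both rule names are hypothetical labels for this sketch.
    """
    candidates = [i for i in range(len(scores)) if i not in labeled]
    if rule == "boundary":
        key = lambda i: abs(scores[i])
    elif rule == "top":
        key = lambda i: -scores[i]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return sorted(candidates, key=key)[:batch_size]
```

In a feedback loop under these assumptions, each iteration would retrain the classifier on the judged documents, rescore the rest, call `select_batch`, and add the user's judgments on the returned batch to the labeled set.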
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Onoda, T., Murata, H., Yamada, S. (2005). Relevance Feedback Document Retrieval Using Support Vector Machines. In: Tsumoto, S., Yamaguchi, T., Numao, M., Motoda, H. (eds) Active Mining. Lecture Notes in Computer Science, vol 3430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11423270_4
DOI: https://doi.org/10.1007/11423270_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26157-5
Online ISBN: 978-3-540-31933-7
eBook Packages: Computer Science (R0)