Abstract
This paper demonstrates that the predictive capabilities of a typical kernel machine on its training set can be a reliable indicator of its performance on an independent test set in the region where scores are larger than 1 in magnitude. We present initial results from a number of experiments on the popular Reuters newswire benchmark and the NIST handwritten digit recognition data set. In particular, we demonstrate that the values of recall and precision estimated from the training set and from an independent test set are within a few percent of each other for the evaluated benchmarks. Interestingly, this holds for both separable and non-separable data, and for training sample sizes an order of magnitude smaller than the dimensionality of the feature space used (e.g. ≈2000 samples versus ≈20000 features for the Reuters data). A theoretical explanation of the observed phenomenon is also presented.
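The idea can be illustrated with a minimal sketch (not the authors' code): train a linear SVM on synthetic, linearly separable data, then compute precision and recall restricted to points whose score exceeds 1 in magnitude, once on the training set and once on a held-out test set. All function names, the Pegasos-style subgradient trainer, and the synthetic data are illustrative assumptions, not drawn from the paper.

```python
# Hedged sketch: compares precision/recall in the high-score region
# (|score| > 1) on the training set vs. an independent test set.
# Toy linear SVM trained with a Pegasos-style subgradient method;
# data, trainer, and names are illustrative, not the paper's setup.
import random

def make_data(n, seed):
    """Synthetic 2-D points, labeled by the true separator x0 + x1 > 0."""
    rng = random.Random(seed)
    return [((rng.uniform(-1, 1), rng.uniform(-1, 1)),) for _ in range(n)] and [
        (x, 1 if x[0] + x[1] > 0 else -1)
        for x in [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ]

def train_svm(data, lam=0.01, epochs=200):
    """Pegasos-style subgradient descent on the hinge loss (no bias term)."""
    w, t = [0.0, 0.0], 0
    for _ in range(epochs):
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * (w[0] * x[0] + w[1] * x[1])
            w = [wi * (1.0 - eta * lam) for wi in w]  # regularization shrink
            if margin < 1:                            # hinge-loss subgradient
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def prec_recall_high_score(w, data):
    """Precision/recall counting only points with |score| > 1."""
    tp = fp = fn = 0
    for x, y in data:
        s = w[0] * x[0] + w[1] * x[1]
        if abs(s) <= 1:
            continue                                  # skip low-score region
        pred = 1 if s > 0 else -1
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == -1:
            fp += 1
        elif pred == -1 and y == 1:
            fn += 1
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

train = make_data(200, seed=0)
test = make_data(1000, seed=1)
w = train_svm(train)
p_tr, r_tr = prec_recall_high_score(w, train)
p_te, r_te = prec_recall_high_score(w, test)
print(f"train: P={p_tr:.3f} R={r_tr:.3f}  test: P={p_te:.3f} R={r_te:.3f}")
```

On separable toy data the train- and test-set figures in the confident region should come out close, which is the qualitative behaviour the abstract reports for Reuters and NIST.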
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Kowalczyk, A., Raskutti, B. (2001). Learner’s self-assessment: a case study of SVM for information retrieval. In: Stumptner, M., Corbett, D., Brooks, M. (eds) AI 2001: Advances in Artificial Intelligence. AI 2001. Lecture Notes in Computer Science(), vol 2256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45656-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42960-9
Online ISBN: 978-3-540-45656-8