Abstract
Data pre-processing always plays a key role in the performance of learning algorithms. In this research we consider data pre-processing by normalization for Support Vector Machines (SVMs). We examine the effect of normalization across 112 classification problems, using SVMs with the RBF kernel. We observe a significant improvement in classification due to normalization. Finally, we suggest a rule-based method to determine when normalization is necessary for a specific classification problem. The best normalization method is also selected automatically by the SVM itself.
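To illustrate why input scaling matters for an RBF-kernel SVM, the following is a minimal sketch in plain Python. It is not the authors' procedure: the abstract does not name the normalization methods studied, so min-max and z-score scaling are assumed here as two common choices, and the toy data, function names, and the value gamma = 1.0 are all hypothetical. The sketch shows that when one feature has a much larger range than another, the RBF kernel value between two points is driven almost entirely by that feature until the inputs are rescaled.

```python
import math


def min_max_normalize(rows):
    """Rescale each column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def z_score_normalize(rows):
    """Shift each column to zero mean and scale to unit (population) std."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(cols, means)
    ]
    return [
        [(v - m) / s if s > 0 else 0.0
         for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical toy data: feature 0 lies in [0, 1], feature 1 in the thousands.
data = [[0.1, 1000.0], [0.9, 1010.0], [0.2, 3000.0], [0.8, 2990.0]]

# Kernel value between the first two points, before and after scaling.
raw_k = rbf_kernel(data[0], data[1])
scaled = min_max_normalize(data)
scaled_k = rbf_kernel(scaled[0], scaled[1])
standardized = z_score_normalize(data)

print("raw kernel value:    ", raw_k)
print("min-max kernel value:", scaled_k)
```

On the raw inputs the kernel value is vanishingly small because the difference of 10 in the second feature dominates the squared distance. After min-max scaling both features contribute on a comparable scale, so the same pair of points yields a kernel value that still carries information about the first feature.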
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ali, S., Smith-Miles, K.A. (2006). Improved Support Vector Machine Generalization Using Normalized Input Space. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_40
DOI: https://doi.org/10.1007/11941439_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)