Abstract
Margin-based feature extraction has become an active topic in machine learning and pattern recognition. In this paper, we present a novel feature extraction method called Adaptive Margin Maximization (AMM), in which a margin is defined to measure the discriminative ability of the features. The motivation comes principally from the iterative weight-modification mechanism of the powerful boosting algorithms. In AMM, the samples are dynamically weighted and the features are learned sequentially. After a new feature is learned by maximizing the weighted total margin of the data, the weights are updated so that samples with smaller margins receive larger weights. The feature learned in the next round thus adaptively concentrates on these "hard" samples. We show that when the data are projected onto the feature space learned by AMM, most examples have large margins, and therefore the nearest-neighbor classifier yields small generalization error. This is in contrast to existing margin-maximization-based feature extraction approaches, whose goal is simply to maximize the total margin. Extensive experimental results on benchmark datasets demonstrate the effectiveness of our method.
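The abstract describes a boosting-style loop: learn one feature direction under the current sample weights, measure each sample's margin in the projected space, and up-weight small-margin samples before the next round. The sketch below illustrates that iteration pattern only; the specific margin criterion (a weighted difference of between-class and within-class scatter), the nearest-neighbor margin definition, and the exponential reweighting are all assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def amm_sketch(X, y, n_features=2):
    """Illustrative boosting-style sequential feature extraction.

    Each round fits one projection direction w maximizing a
    sample-weighted scatter criterion w'(Sb - Sw)w (assumed here as a
    stand-in for the paper's weighted total margin), then up-weights
    samples whose margin in the current projected space is small, so
    the next direction focuses on the "hard" samples.
    """
    n, d = X.shape
    weights = np.full(n, 1.0 / n)
    W = []
    for _ in range(n_features):
        # Weighted between-class (Sb) and within-class (Sw) scatter.
        mu = np.average(X, axis=0, weights=weights)
        Sb = np.zeros((d, d))
        Sw = np.zeros((d, d))
        for c in np.unique(y):
            mask = y == c
            wc = weights[mask]
            mu_c = np.average(X[mask], axis=0, weights=wc)
            diff = (mu_c - mu)[:, None]
            Sb += wc.sum() * diff @ diff.T
            Xc = X[mask] - mu_c
            Sw += (Xc * wc[:, None]).T @ Xc
        # New direction: top eigenvector of the weighted criterion.
        _, vecs = np.linalg.eigh(Sb - Sw)
        W.append(vecs[:, -1])
        # Sample margin in the projected space: distance to the nearest
        # different-class sample minus distance to the nearest
        # same-class sample (assumes >= 2 samples per class).
        Z = X @ np.array(W).T
        margins = np.empty(n)
        for i in range(n):
            dist = np.linalg.norm(Z - Z[i], axis=1)
            dist[i] = np.inf
            margins[i] = dist[y != y[i]].min() - dist[y == y[i]].min()
        # Up-weight small-margin ("hard") samples, AdaBoost-style.
        weights = np.exp(-margins)
        weights /= weights.sum()
    return np.array(W).T  # d x n_features projection matrix
```

Projecting the data with the returned matrix (`X @ P`) gives the learned feature space in which a nearest-neighbor classifier would then be applied.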
Yang, C., Wang, L. & Feng, J. A Novel Margin Based Algorithm for Feature Extraction. New Gener. Comput. 27, 285–305 (2009). https://doi.org/10.1007/s00354-009-0066-z