Abstract
Existing Multi-View Learning (MVL) addresses how to learn from patterns with multiple information sources and has been shown to generalize better than the usual Single-View Learning (SVL). However, in most real-world cases only single-source patterns are available, so existing MVL cannot be applied. The purpose of this paper is to develop a new multi-view regularization learning method for single-source patterns. Concretely, given single-source patterns, we first map them into M feature spaces using M different empirical kernels, then associate each generated feature space with our previously proposed Discriminative Regularization (DR), and finally synthesize the M DRs into a single learning process to obtain a new Multi-view Discriminative Regularization (MVDR), in which each DR can be taken as one view. The proposed method achieves: (1) complementarity among the multiple views generated from single-source patterns; (2) an analytic solution for classification; (3) a direct optimization formulation for multi-class problems that requires neither a one-against-all nor a one-against-one strategy.
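The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the MVDR formulation itself: the abstract does not give the DR term, so the sketch substitutes plain ridge (Tikhonov) regularization on one-hot labels in each view and simply sums the per-view scores instead of optimizing the views jointly. The empirical kernel map follows the standard eigendecomposition construction, in which the mapped training patterns reproduce the kernel matrix. The RBF kernel, the widths in `gammas`, the regularization value `lam`, the toy data, and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def fit_empirical_map(X, gamma, tol=1e-10):
    # Eigendecompose K = Q diag(evals) Q^T and keep the numerically positive part.
    K = rbf_kernel(X, X, gamma)
    evals, evecs = np.linalg.eigh(K)
    keep = evals > tol * evals.max()
    # P = Q_r diag(evals_r)^(-1/2), so that phi(x) = P^T [k(x, x_1), ..., k(x, x_n)]^T.
    return evecs[:, keep] / np.sqrt(evals[keep])

def fit_multiview_ridge(X, y, gammas, lam=1e-2):
    # ASSUMPTION: ridge regression stands in for the paper's DR term.
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    views = []
    for g in gammas:
        P = fit_empirical_map(X, g)
        Phi = rbf_kernel(X, X, g) @ P  # training patterns in this view
        # Closed-form ridge solution, one weight column per class.
        A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
        W = np.linalg.solve(A, Phi.T @ Y)
        views.append((g, P, W))
    return classes, views

def predict(X_train, X_new, classes, views):
    # ASSUMPTION: views are combined by summing their class scores.
    scores = sum(rbf_kernel(X_new, X_train, g) @ P @ W for g, P, W in views)
    return classes[np.argmax(scores, axis=1)]

# Toy single-source data: three well-separated 2-D clusters.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
y = np.repeat(np.arange(3), 20)

gammas = (0.1, 1.0, 10.0)  # M = 3 kernels, hence 3 generated views
classes, views = fit_multiview_ridge(X, y, gammas)
pred = predict(X, X, classes, views)
print("training accuracy:", (pred == y).mean())
```

Each kernel width yields a different feature space for the same patterns, which is how multiple views arise from a single source. The one-hot targets give a closed-form solution that handles all classes in one linear solve, with no one-against-all or one-against-one decomposition.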
Cite this article
Wang, Z., Chen, S., Xue, H. et al. A Novel Regularization Learning for Single-View Patterns: Multi-View Discriminative Regularization. Neural Process Lett 31, 159–175 (2010). https://doi.org/10.1007/s11063-010-9132-2
DOI: https://doi.org/10.1007/s11063-010-9132-2