
A Novel Regularization Learning for Single-View Patterns: Multi-View Discriminative Regularization

Neural Processing Letters

Abstract

Existing Multi-View Learning (MVL) studies how to learn from patterns with multiple information sources, and has been shown to generalize better than the usual Single-View Learning (SVL). In most real-world cases, however, only single-source patterns are available, so existing MVL methods cannot be applied. The purpose of this paper is to develop a new multi-view regularization learning method for single-source patterns. Concretely, given single-source patterns, we first map them into M feature spaces through M different empirical kernels, then associate each generated feature space with our previously proposed Discriminative Regularization (DR), and finally synthesize the M DRs into a single learning process, yielding a new Multi-view Discriminative Regularization (MVDR) in which each DR can be regarded as one view. The proposed method achieves: (1) complementarity among the multiple views generated from single-source patterns; (2) an analytic solution for classification; (3) a direct optimization formulation for multi-class problems without one-against-all or one-against-one strategies.
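The view-generation step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the empirical kernel map of the training set is computed from the eigendecomposition of each Gram matrix, and the two example kernels (an RBF and a degree-2 polynomial, with arbitrary parameters) are hypothetical stand-ins for the M empirical kernels of the method.

```python
import numpy as np

def empirical_kernel_map(X, kernel):
    """Map training patterns into the empirical feature space of a kernel.

    Returns an (n, r) matrix of mapped patterns, where r is the numerical
    rank of the Gram matrix; rows satisfy Phi @ Phi.T == K.
    """
    K = kernel(X, X)                    # (n, n) Gram matrix
    vals, vecs = np.linalg.eigh(K)      # K = Q diag(vals) Q^T
    keep = vals > 1e-10                 # drop (near-)zero eigenvalues
    # Empirical kernel map: Lambda^{-1/2} Q^T k(x, .); for the training
    # set this stacks to K Q Lambda^{-1/2}.
    return K @ vecs[:, keep] / np.sqrt(vals[keep])

# Two hypothetical kernels generating two "views" from one data source
rbf = lambda A, B: np.exp(-0.5 * np.square(A[:, None] - B[None]).sum(-1))
poly = lambda A, B: (A @ B.T + 1.0) ** 2

X = np.random.RandomState(0).randn(20, 3)
views = [empirical_kernel_map(X, k) for k in (rbf, poly)]
```

Each matrix in `views` would then be paired with one DR term, and the M terms combined in a single objective; the number of views M is simply the number of empirical kernels supplied.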




Correspondence to Songcan Chen.

Cite this article

Wang, Z., Chen, S., Xue, H. et al. A Novel Regularization Learning for Single-View Patterns: Multi-View Discriminative Regularization. Neural Process Lett 31, 159–175 (2010). https://doi.org/10.1007/s11063-010-9132-2
