Part of the book series: Springer Handbooks (SHB)

Abstract

This tutorial provides a brief overview of a number of important tools that form the core of the modern machine learning toolbox. These tools can be used for supervised learning, unsupervised learning, reinforcement learning, and the numerous variants that have been developed over the years. Owing to space limitations, this survey is not intended to be comprehensive. Interested readers are referred to conference proceedings such as Neural Information Processing Systems (NIPS) and the International Conference on Machine Learning (ICML) for the most recent advances.

Abbreviations

BIC: Bayesian information criterion

BMA: Bayes model averaging

BMF: binary matrix factorization

BYY: Bayesian Ying-Yang

DCA: de-correlated component analysis

EM: expectation maximization

FA: factor analysis

HMM: hidden Markov model

HT: Hough transform

ICA: independent component analysis

ICML: International Conference on Machine Learning

IFA: independent factor analysis

KPCA: kernel principal component analysis

LDA: linear discriminant analysis

LFA: local factor analysis

MCA: minor component analysis

MDL: minimum description length

MDP: Markov decision process

MIL: multi-instance learning

MIML: multi-instance, multi-label learning

MLR: multi-response linear regression

MSA: minor subspace analysis

MTFL: multi-task feature learning

MTL: multi-task learning

NFA: non-Gaussian factor analysis

NIPS: Neural Information Processing Systems

NMF: nonnegative matrix factorization

PCA: principal component analysis

PSA: principal subspace analysis

RBF: radial basis function

RHT: randomized Hough transform

RMTL: regularized multi-task learning

RPCL: rival penalized competitive learning

S3VM: semi-supervised support vector machine

SARSA: state-action-reward-state-action

SBF: subspace-based function

SSM: state-space model

TD: temporal difference

TFA: temporal factor analysis

WTA: winner-take-all

Author information

Corresponding author

Correspondence to James T. Kwok.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kwok, J.T., Zhou, ZH., Xu, L. (2015). Machine Learning. In: Kacprzyk, J., Pedrycz, W. (eds) Springer Handbook of Computational Intelligence. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43505-2_29

  • DOI: https://doi.org/10.1007/978-3-662-43505-2_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43504-5

  • Online ISBN: 978-3-662-43505-2

  • eBook Packages: Engineering
