
A study on random weights between input and hidden layers in extreme learning machine


Abstract

Extreme learning machine (ELM), an emergent technique for training feed-forward neural networks, has shown good performance on various learning domains. This paper investigates the impact of the random weights assigned during the training of ELM. It focuses on the randomness of the weights between the input and hidden layers, and on the dimension change from the input layer to the hidden layer. The direct motivation is to verify whether the randomly assigned weights exert a positive effect during the training of ELM. We show experimentally that, for many classification and regression problems, the dimension increase caused by random weights in ELM yields better performance than the dimension increase caused by some kernel mappings. We conjecture that, after the random transformation, the output samples are more concentrated than the input samples, which makes the learning more efficient.
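The training scheme the abstract refers to can be made concrete with a short sketch. The following is a minimal illustration of standard ELM training, assuming a sigmoid activation and Gaussian-initialized random weights; the function names and parameter choices are illustrative, not details taken from this paper. The input-to-hidden weights are drawn at random and never updated; only the hidden-to-output weights are fitted, by least squares via the Moore-Penrose pseudoinverse.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """Train an ELM: X is (N, d) inputs, T is (N, m) targets."""
    n_features = X.shape[1]
    # Input-to-hidden weights and biases are assigned at random and kept fixed.
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    # Hidden-layer output matrix H via a sigmoid activation.
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Output weights: least-squares solution beta = pinv(H) @ T.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Under this scheme the only trained parameters are in beta, which is why the random input-to-hidden mapping, and the dimension change it induces, is the natural object of study.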




Acknowledgments

This work was partly supported by the City University Strategic Research Grant (SRG) 7002680.

Author information

Correspondence to Ran Wang.

Appendix

A Kernel function and its determinant

A.1 Definition of kernel function

Definition 1

(Kernel function) Suppose that \(\mathbf{X}\) is a subset of \(\mathbf{R}^n.\) A function \(k(\mathbf{x}_i,\mathbf{x}_j)\) defined on \(\mathbf{X}\times\mathbf{X}\) is called a kernel function if there exists a mapping \({\phi:\mathbf{X}\to\mathbb{H},\ \mathbf{x}\mapsto\phi(\mathbf{x}),}\) such that \(k(\mathbf{x}_i,\mathbf{x}_j)=\langle\phi(\mathbf{x}_i),\phi(\mathbf{x}_j)\rangle,\) where \({\mathbb{H}}\) denotes a Hilbert space and \(\langle\cdot,\cdot\rangle\) denotes the inner product in \({\mathbb{H}}.\)
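As a concrete illustration (a worked example added here, not drawn from the original text): for \(\mathbf{x}=(x_1,x_2)^{\top}\in\mathbf{R}^2,\) the homogeneous quadratic kernel \(k(\mathbf{x},\mathbf{z})=(\mathbf{x}^{\top}\mathbf{z})^2\) satisfies Definition 1 with the explicit feature map \(\phi(\mathbf{x})=(x_1^2,\,\sqrt{2}\,x_1x_2,\,x_2^2)^{\top},\) since \(\langle\phi(\mathbf{x}),\phi(\mathbf{z})\rangle=x_1^2z_1^2+2x_1x_2z_1z_2+x_2^2z_2^2=(\mathbf{x}^{\top}\mathbf{z})^2.\)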

A.2 Mercer theorem

Theorem 1

(Mercer theorem) Suppose that \(\chi\) is a compact subset of \(\mathbf{R}^n.\) A symmetric function \(k(\mathbf{x}_i,\mathbf{x}_j)\) defined on \(\chi\times\chi\) is the inner product of a feature space if and only if, for any function \(\phi\) with \(\phi(\mathbf{x})\neq0\) and \(\int\phi^2(\mathbf{x})\,d\mathbf{x}<\infty,\) it holds that \(\int\!\!\int k(\mathbf{x},\mathbf{x}_i)\phi(\mathbf{x})\phi(\mathbf{x}_i)\,d\mathbf{x}\,d\mathbf{x}_i>0.\)
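A quick numerical illustration of this integral condition (an added sketch, not from the paper): discretizing the double integral on a grid turns it into a quadratic form in the kernel matrix, which is positive for a valid kernel such as the Gaussian and a nonzero test function. The grid size, kernel width, and test function below are arbitrary choices.

```python
import numpy as np

# Discretize the Mercer condition for a Gaussian kernel on [0, 1]:
# approximate I = \int\int k(x, y) phi(x) phi(y) dx dy by a Riemann sum.
xs = np.linspace(0.0, 1.0, 200)
dx = xs[1] - xs[0]
K = np.exp(-((xs[:, None] - xs[None, :]) ** 2) / 0.1)  # Gaussian kernel matrix on the grid
phi = np.sin(3 * np.pi * xs)                           # an arbitrary nonzero test function
I = (phi @ K @ phi) * dx * dx                          # discretized double integral
print(I > 0)                                           # True: the quadratic form is positive
```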

A.3 Definition of Gram matrix (kernel matrix)

Definition 2

Given a function \({k:\chi\times\chi\to \mathbb{K}}\) (where \(\chi\) is a compact subset of \(\mathbf{R}^n\) and \({\mathbb{K}}\) is the mapped set) and patterns \(\mathbf{x}_1,\ldots,\mathbf{x}_N\in\chi,\) the N × N matrix \(\mathbf{K}\) with elements \(\mathbf{K}_{i,j}:=k(\mathbf{x}_i,\mathbf{x}_j)\) is called the Gram matrix (or kernel matrix) of k with respect to \(\mathbf{x}_1,\ldots,\mathbf{x}_N.\)

A.4 Property of kernel function

Theorem 2

Suppose that \(\chi\) is a compact subset of \(\mathbf{R}^n.\) A continuous and symmetric function \(k(\mathbf{x}_i,\mathbf{x}_j)\) defined on \(\chi\times\chi\) is a kernel function if and only if the Gram matrix of k with respect to any \(\mathbf{x}_1,\ldots,\mathbf{x}_N\in\chi\) is positive semi-definite.
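Theorem 2 suggests a direct numerical check: build the Gram matrix of Definition 2 for a sample of points and inspect its eigenvalues. The sketch below does this for the Gaussian (RBF) kernel; the kernel choice, sample size, and numerical tolerance are illustrative assumptions, not prescriptions from the paper.

```python
import numpy as np

def gram_matrix(X, sigma=1.0):
    # Gaussian (RBF) Gram matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))   # 50 random patterns in R^3
K = gram_matrix(X)
eigvals = np.linalg.eigvalsh(K)    # K is symmetric, so eigvalsh applies
print(eigvals.min() >= -1e-10)     # True: all eigenvalues are (numerically) non-negative
```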



Cite this article

Wang, R., Kwong, S. & Wang, X. A study on random weights between input and hidden layers in extreme learning machine. Soft Comput 16, 1465–1475 (2012). https://doi.org/10.1007/s00500-012-0829-1
