Abstract
In this paper, we propose a method to design Neural Networks with Random Weights in the presence of incomplete data. Under the general assumption that the data are missing at random, the method estimates the weights of the output layer as a function of the uncertainty of the missing data estimates. The proposed method uses the Unscented Transform to approximate the expected values and the variances of the training examples after the hidden layer. The input data are modeled as a Gaussian Mixture Model whose parameters are estimated via a maximum likelihood approach. The validity of the proposed method is empirically assessed under a range of conditions on simulated and real problems. We conduct numerical experiments to compare the performance of the proposed method against popular parametric and non-parametric imputation methods. Based on the results observed in the experiments, we conclude that the proposed method consistently outperforms its counterparts.
References
Abdella M, Marwala T (2005) The use of genetic algorithms and neural networks to approximate missing data in database. In: IEEE 3rd international conference on computational cybernetics ICCC 2005, pp 207–212
Braake HAT, Straten GV (1995) Random activation weight neural net (RAWN) for fast non-iterative training. Eng Appl Artif Intell 8(1):71–80. https://doi.org/10.1016/0952-1976(94)00056-S
Broomhead DS, Lowe D (1988) Multivariable functional interpolation and adaptive networks. Complex Syst 2:321–355
Cai J, Candès E, Shen Z (2010) A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4):1956–1982. https://doi.org/10.1137/080738970
Cox D, Pinto N (2011) Beyond simple features: a large-scale feature search approach to unconstrained face recognition. Face Gesture 2011:8–15. https://doi.org/10.1109/FG.2011.5771385
Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2(4):303–314
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Ding Y, Simonoff JS (2010) An investigation of missing data methods for classification trees applied to binary response data. J Mach Learn Res 11:131–170
Eirola E, Lendasse A, Vandewalle V, Biernacki C (2014) Mixture of Gaussians for distance estimation with missing data. Neurocomputing 131:32–42
Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92
Funahashi KI (1989) On the approximate realization of continuous mappings by neural networks. Neural Netw 2(3):183–192
Garcia-Laencina PJ, Sancho-Gomez JL, Figueiras-Vidal AR (2010) Pattern classification with missing data: a review. Neural Comput Appl 19(2):263–282
Giryes R, Sapiro G, Bronstein AM (2016) Deep neural networks with random gaussian weights: a universal classification strategy? IEEE Trans Signal Process 64:3444–3457
Guo P (2018) A vest of the pseudoinverse learning algorithm. CoRR arXiv:1805.07828
Guo P, Chen PC, Sun Y (1995) An exact supervised learning for a three-layer supervised neural network. In: International conference on neural information processing (ICONIP), Beijing, pp 1041–1044
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
Hulse JV, Khoshgoftaar TM (2014) Incomplete-case nearest neighbor imputation in software measurement data. Inf Sci 259:596–610
Hunt L, Jorgensen M (2003) Mixture model clustering for mixed data with missing information. Comput Stat Data Anal 41(3–4):429–440
Gheyas IA, Smith LS (2010) A neural network-based framework for the reconstruction of incomplete data sets. Neurocomputing 73(16–18):3039–3065
Igelnik B, Pao YH (1995) Stochastic choice of basis functions in adaptive function approximation and the functional-link net. IEEE Trans Neural Netw 6(6):1320–1329. https://doi.org/10.1109/72.471375
Julier SJ, Uhlmann JK (1997) A new extension of the Kalman filter to nonlinear systems. In: SPIE aerosense symposium, pp 182–193
Julier SJ, Uhlmann JK (2004) Unscented filtering and nonlinear estimation. Proc IEEE 92(3):401–422
Kang P (2013) Locally linear reconstruction based missing value imputation for supervised learning. Neurocomputing 118:65–78
Leão BP, Yoneyama T (2011) On the use of the unscented transform for failure prognostics. In: IEEE aerospace conference. IEEE, Big Sky
Li C, Zhou H (2017) svt: Singular value thresholding in MATLAB. J Stat Softw, Code Snippets 81(2):1–13. https://doi.org/10.18637/jss.v081.c02
Li M, Wang D (2017) Insights into randomized algorithms for neural networks: practical issues and common pitfalls. Inf Sci 382–383:170–178. https://doi.org/10.1016/j.ins.2016.12.007
Li Y, Yu W (2017) A fast implementation of singular value thresholding algorithm using recycling rank revealing randomized singular value decomposition. CoRR arXiv:1704.05528
Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 5 Jan 2018
Little RJA, Rubin DB (2002) Statistical analysis with missing data. Wiley, Hoboken
Luengo J, García S, Herrera F (2010) A study on the use of imputation methods for experimentation with radial basis function network classifiers handling missing attribute values: the good synergy between RBFNs and eventcovering method. Neural Netw 23(3):406–418
Meng XL, Rubin DB (1993) Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80(2):267–278
Mesquita DP, Gomes JP, Souza AH Jr, Nobre JS (2017) Euclidean distance estimation in incomplete datasets. Neurocomputing 248:11–18. https://doi.org/10.1016/j.neucom.2016.12.081
Mesquita DP, Gomes JP, Corona F, Souza AH, Nobre JS (2019) Gaussian kernels for incomplete data. Appl Soft Comput 77:356–365. https://doi.org/10.1016/j.asoc.2019.01.022
Mesquita DPP, Gomes JPP, Souza AH Jr (2017) Epanechnikov kernel for incomplete data. Electron Lett 53(21):1408–1410. https://doi.org/10.1049/el.2017.0507
Oliveira PG, Coelho AL (2009) Genetic versus nearest-neighbor imputation of missing attribute values for RBF networks. In: Koppen M, Kasabov N, Coghill G (eds) Advances in neuro-information processing. Springer, Berlin, pp 276–283
Pao YH, Phillips SM, Sobajic DJ (1992) Neural-net computing and the intelligent control of systems. Int J Control 56(2):263–289
Pao YH, Park GH, Sobajic DJ (1994) Learning and generalization characteristics of the random vector functional-link net. Neurocomputing 6(2):163–180. https://doi.org/10.1016/0925-2312(94)90053-1
Pelckmans K, Brabanter JD, Suykens J, Moor BD (2005) Handling missing values in support vector machine classifiers. Neural Netw 18(5–6):684–692
Pinto N, Doukhan D, DiCarlo JJ, Cox DD (2009) A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLOS Comput Biol 5(11):1–12. https://doi.org/10.1371/journal.pcbi.1000579
Rudi A, Rosasco L (2017) Generalization properties of learning with random features. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc., pp 3215–3225. http://papers.nips.cc/paper/6914-generalization-properties-of-learning-with-random-features.pdf
Saxe AM, Koh PW, Chen Z, Bhand M, Suresh B, Ng AY (2011) On random weights and unsupervised feature learning. In: Proceedings of the 28th international conference on machine learning ICML’11. Omnipress, Madison, pp 1089–1096
Scardapane S, Wang D (2017) Randomness in neural networks: an overview. Wiley Interdisc Rev: Data Min Knowl Discov 7:e1200
Schmidt WF, Kraaijveld MA, Duin RPW (1992) Feedforward neural networks with random weights. In: Proceedings, 11th IAPR international conference on pattern recognition, conference B: pattern recognition methodology and systems, vol 2, pp 1–4
Smola AJ, Vishwanathan SVN, Hofmann T (2005) Kernel methods for missing variables. In: Proceedings of the tenth international workshop on artificial intelligence and statistics, pp 325–332
Stosic D, Stosic D, Zanchettin C, Ludermir T, Stosic B (2017) QRNN: \(q\)-generalized random neural network. IEEE Trans Neural Netw Learn Syst 28(2):383–390
Suganthan PN (2018) Letter: on non-iterative learning algorithms with closed-form solution. Appl Soft Comput 70:1078–1082. https://doi.org/10.1016/j.asoc.2018.07.013
Vidya L, Vivekanand V, Shyamkumar U, Mishra D (2015) RBF-network based sparse signal recovery algorithm for compressed sensing reconstruction. Neural Netw 63:66–78
Wang D, Li M (2017) Deep stochastic configuration networks: universal approximation and learning representation. CoRR arXiv:1702.05639
Wang D, Li M (2017) Stochastic configuration networks: fundamentals and algorithms. IEEE Trans Cybern 47(10):3466–3479. https://doi.org/10.1109/TCYB.2017.2734043
Yu Q, Miche Y, Eirola E, van Heeswijk M, Séverin E, Lendasse A (2013) Regularized extreme learning machine for regression with missing data. Neurocomputing 102:45–51
Ding Z, Fu Y (2018) Deep domain generalization with structured low-rank constraint. IEEE Trans Image Process 27(1):304–313. https://doi.org/10.1109/TIP.2017.2758199
Zhang L, Suganthan P (2016) A survey of randomized algorithms for training neural networks. Inf Sci 364–365:146–155. https://doi.org/10.1016/j.ins.2016.01.039
Acknowledgements
The authors would like to thank the Brazilian National Council for Scientific and Technological Development (CNPq) for the financial support (Grant No. 305048/2016-3).
Appendix: Unscented Transform (UT)
Given a \(D\)-dimensional random variable \(X\), we are interested in estimating the statistical moments of \(\psi \), the random variable that results from applying a non-linear function \(h(\cdot )\) to \(X\). These moments could be obtained via standard sampling procedures or numerical integration methods. However, such procedures can be computationally intensive and depend on many factors, such as proper initialization and stopping criteria. The Unscented Transform (UT) provides a scheme to estimate the moments of \(\psi \) using a small set of deterministically chosen samples from the space of \(X\), referred to as sigma points (SPs).
There are different possible ways to choose the SPs. A common approach is to use a symmetric set of \(S = 2D + 1\) SPs, as described in Eqs. (32) to (34):
$$ \mathcal {X}^{(1)} = \mu , \quad (32) $$
$$ \mathcal {X}^{(s+1)} = \mu + \left[ \sqrt{D \, \varSigma }\right] _s, \quad s = 1, \ldots , D, \quad (33) $$
$$ \mathcal {X}^{(s+D+1)} = \mu - \left[ \sqrt{D \, \varSigma }\right] _s, \quad s = 1, \ldots , D, \quad (34) $$
where \(\mu \) denotes the mean of \(X\) and \(\left[ \sqrt{D \, \varSigma }\right] _s\) denotes the \(s\)-th row of the matrix square root of \(D \, \varSigma \), with \(\varSigma \) being the covariance matrix of \(X\).
Given the SPs and a set of weights \(\{k_s\}^S_{s=1} \subset {\mathbb {R}}\), we can approximate the moments of \(\psi \) using a simple set of rules. For instance, \({\mathbb {E}}[\psi ]\) and \(\mathrm {Cov}(\psi )\) can then be approximated using the following equations:
$$ {\mathbb {E}}[\psi ] \approx \sum _{s=1}^{S} k_s \, h\big (\mathcal {X}^{(s)}\big ), $$
$$ \mathrm {Cov}(\psi ) \approx \sum _{s=1}^{S} k_s \big (h(\mathcal {X}^{(s)}) - {\mathbb {E}}[\psi ]\big ) \big (h(\mathcal {X}^{(s)}) - {\mathbb {E}}[\psi ]\big )^{\top }. $$
Although there is no restriction on their sign, the weights \(k_1, \ldots , k_S\) must respect the convexity constraint
$$ \sum _{s=1}^{S} k_s = 1 $$
to provide an unbiased estimate [22]. In this paper, we set \(k_1 = k_2 = \cdots = k_S = 1/S\).
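To make the procedure above concrete, the following is a minimal NumPy sketch of the symmetric sigma-point construction in Eqs. (32) to (34) and the equal-weight moment approximations; it is an illustration under our stated conventions, not the implementation used in the experiments. The function names `sigma_points` and `ut_moments` and the example nonlinearity (an elementwise \(\tanh \)) are our own illustrative choices, and the Cholesky factor is used as one valid matrix square root of \(D \, \varSigma \).

```python
import numpy as np

def sigma_points(mu, Sigma):
    """Symmetric set of S = 2D + 1 sigma points, following Eqs. (32)-(34)."""
    D = mu.shape[0]
    # A is one valid matrix square root of D * Sigma (A^T A = D * Sigma);
    # its s-th row plays the role of [sqrt(D * Sigma)]_s.
    A = np.linalg.cholesky(D * Sigma).T
    pts = [mu] + [mu + A[s] for s in range(D)] + [mu - A[s] for s in range(D)]
    return np.array(pts)  # shape (2D + 1, D)

def ut_moments(mu, Sigma, h):
    """Approximate E[psi] and Cov(psi) for psi = h(X), with k_s = 1/S."""
    Y = np.array([h(x) for x in sigma_points(mu, Sigma)])  # propagate SPs
    mean = Y.mean(axis=0)             # sum_s k_s h(X^(s)), k_s = 1/S
    diff = Y - mean
    cov = diff.T @ diff / Y.shape[0]  # sum_s k_s (h(X^(s)) - E)(h(X^(s)) - E)^T
    return mean, cov

# Usage: a 2-D Gaussian pushed through an elementwise tanh nonlinearity.
mu = np.array([0.5, -1.0])
Sigma = np.array([[0.20, 0.05],
                  [0.05, 0.10]])
mean, cov = ut_moments(mu, Sigma, np.tanh)
print("E[psi]   ~", mean)
print("Cov(psi) ~\n", cov)
```

Note that any factor \(A\) with \(A^{\top } A = D \, \varSigma \) yields a valid symmetric set; the Cholesky factor is merely a convenient and numerically stable choice.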
Cite this article
Mesquita, D.P.P., Gomes, J.P.P. & Rodrigues, L.R. Artificial Neural Networks with Random Weights for Incomplete Datasets. Neural Process Lett 50, 2345–2372 (2019). https://doi.org/10.1007/s11063-019-10012-0