Skip to main content
Log in

Soft-Constrained Neural Networks for Nonparametric Density Estimation

  • Published:
Neural Processing Letters Aims and scope Submit manuscript

Abstract

The paper introduces a robust connectionist technique for the empirical nonparametric estimation of multivariate probability density functions (pdf) from unlabeled data samples (still an open issue in pattern recognition and machine learning). To this end, a soft-constrained unsupervised algorithm for training a multilayer perceptron (MLP) is proposed. A variant of the Metropolis–Hastings algorithm (exploiting the very probabilistic nature of the present MLP) is used to guarantee a model that satisfies numerically Kolmogorov’s second axiom of probability. The approach overcomes the major limitations of the established statistical and connectionist pdf estimators. Graphical and quantitative experimental results show that the proposed technique can offer estimates that improve significantly over parametric and nonparametric approaches, regardless of (1) the complexity of the underlying pdf, (2) the dimensionality of the feature space, and (3) the amount of data available for training.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

Notes

  1. Section 3 covers the issue from the practical standpoint.

  2. Or, any other statistical significance test, for that matter.

References

  1. Andrieu C, de Freitas N, Doucet A, Jordan MI (2003) An introduction to MCMC for machine learning. Mach Learn 50(1–2):5–43

    Article  Google Scholar 

  2. Beirami A, Sardari M, Fekri F (2016) Wireless network compression via memory-enabled overhearing helpers. IEEE Trans Wirel Commun 15(1):176–190

    Article  Google Scholar 

  3. Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford

    MATH  Google Scholar 

  4. Castillo E, Hadi A, Balakrishnan N, Sarabia J (2004) Extreme value and related models with applications in engineering and science, Wiley Series in Probability and Statistics. Wiley, London

    Google Scholar 

  5. Cybenko G (1989) Approximation by superposition of sigmoidal functions. Math Control Signal Syst 2(4):303–314

    Article  MathSciNet  Google Scholar 

  6. Duda RO, Hart PE, Stork DG (2000) Pattern classification, 2nd edn. Wiley-Interscience, New York

    MATH  Google Scholar 

  7. Huang CM, Lee YJ, Lin DKJ, Huang SY (2007) Model selection for support vector machines via uniform design. Comput Stat Data Anal 52(1):335–346

    Article  MathSciNet  Google Scholar 

  8. Japkowicz N, Shah M (2011) Evaluating learning algorithms: a classification perspective. Cambridge University Press, New York

    Book  Google Scholar 

  9. Koslicki D, Thompson D (2015) Coding sequence density estimation via topological pressure. J Math Biol 70(1/2):45–69

    Article  MathSciNet  Google Scholar 

  10. Liang F, Barron A (2004) Exact minimax strategies for predictive density estimation, data compression, and model selection. IEEE Trans Inf Theory 50(11):2708–2726

    Article  MathSciNet  Google Scholar 

  11. Magdon-Ismail M, Atiya A (2002) Density estimation and random variate generation using multilayer networks. IEEE Trans Neural Netw 13(3):497–520

    Article  Google Scholar 

  12. Modha DS, Fainman Y (1994) A learning law for density estimation. IEEE Trans Neural Netw 5(3):519–523

    Article  Google Scholar 

  13. Newman MEJ, Barkema GT (1999) Monte Carlo methods in statistical physics. Oxford University Press, Oxford

    MATH  Google Scholar 

  14. Ohl T (1999) VEGAS revisited: adaptive Monte Carlo integration beyond factorization. Comput Phys Commun 120:13–19

    Article  Google Scholar 

  15. Rissanen J (1978) Modeling by shortest data description. Automatica 14(5):465–471

    Article  Google Scholar 

  16. Rubinstein RY, Kroese DP (2012) Simulation and the Monte Carlo method, 2nd edn. Wiley, London

    MATH  Google Scholar 

  17. Rust R, Schmittlein D (1985) A Bayesian cross-validated likelihood method for comparing alternative specifications of quantitative models. Mark Sci 4(1):20–40

    Article  Google Scholar 

  18. Scholkopf B, Platt JC, Shawe-Taylor JC, Smola AJ, Williamson RC (2001) Estimating the support of a high-dimensional distribution. Neural Comput 13(7):1443–1471

    Article  Google Scholar 

  19. Schwenker F, Abbas HM, Gayar NE, Trentin E (eds) (2016) Artificial neural networks in pattern recognition. In: 7th IAPR TC3 Workshop, ANNPR 2016, proceedings, Lecture Notes in Computer Science, vol 9896. Springer, Berlin

  20. Trentin E (2001) Networks with trainable amplitude of activation functions. Neural Netw 14(45):471–493

    Article  Google Scholar 

  21. Trentin E (2006) Simple and effective connectionist nonparametric estimation of probability density functions. In: Proceedings of the 2nd IAPR workshop on artificial neural networks in pattern recognition. Springer, Berlin, pp 1–10

    Google Scholar 

  22. Trentin E (2015) Maximum-likelihood normalization of features increases the robustness of neural-based spoken human–computer interaction. Pattern Recogn Lett 66:71–80

    Article  Google Scholar 

  23. Trentin E (2016) Soft-constrained nonparametric density estimation with artificial neural networks. In: Proceedings of the 7th workshop on artificial neural networks in pattern recognition (ANNPR). Springer, Berlin, pp 68–79

    Chapter  Google Scholar 

  24. Trentin E, Gori M (2003) Robust combination of neural networks and hidden Markov models for speech recognition. IEEE Trans Neural Netw 14(6):1519–1531

    Article  Google Scholar 

  25. Vapnik V (1995) The nature of statistical learning theory. Springer, New York

    Book  Google Scholar 

  26. Weston J, Gammerman A, Stitson M, Vapnik V, Vovk V, Watkins C (1999) Support vector density estimation. In: Scholkopf B, Burges C, Smola A (eds) Advances in kernel methods: support vector learning. MIT Press, Cambridge, pp 293–306

    Google Scholar 

  27. Yang Z (2010) Machine learning approaches to bioinformatics. World Scientific Publishing Company, Singapore

    Book  Google Scholar 

Download references

Acknowledgements

I gratefully acknowledge having been endowed with inspirational support from Alessandra in accomplishing this research.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Edmondo Trentin.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Trentin, E. Soft-Constrained Neural Networks for Nonparametric Density Estimation. Neural Process Lett 48, 915–932 (2018). https://doi.org/10.1007/s11063-017-9740-1

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11063-017-9740-1

Keywords

Navigation